HOW MANY ZEROS OF A RANDOM POLYNOMIAL ARE REAL?

ALAN EDELMAN AND ERIC KOSTLAN 



Abstract. We provide an elementary geometric derivation of the Kac integral
formula for the expected number of real zeros of a random polynomial
with independent standard normally distributed coefficients. We show that
the expected number of real zeros is simply the length of the moment curve
$(1, t, \ldots, t^n)$ projected onto the surface of the unit sphere, divided by $\pi$. The
probability density of the real zeros is proportional to how fast this curve is
traced out.

We then relax Kac's assumptions by considering a variety of random sums, 
series, and distributions, and we also illustrate such ideas as integral geometry 
and the Fubini-Study metric. 
1. Introduction

What is the expected number of real zeros $E_n$ of a random polynomial of degree
$n$? If the coefficients are independent standard normals, we show that as $n \to \infty$,

$$E_n = \frac{2}{\pi}\log(n) + 0.6257358072\ldots + \frac{2}{n\pi} + O(1/n^2).$$

The $\frac{2}{\pi}\log n$ term was derived by Kac in 1943 [26], who produced an integral formula
for the expected number of real zeros. Papers on zeros of random polynomials 
include [3], [16], [23], [34], [41, 42] and [36]. There is also the comprehensive book 
of Bharucha-Reid and Sambandham [2] . 

We will derive the Kac formula for the expected number of real zeros with an 
elementary geometric argument that is related to the Buffon needle problem. We 
present the argument in a manner such that precalculus level mathematics is sufficient
for understanding (and enjoying) the introductory arguments, while elementary
calculus and linear algebra are sufficient prerequisites for much of the paper.
Nevertheless, we introduce connections with advanced areas of mathematics. 

A seemingly small variation of our opening problem considers random nth degree 
polynomials with independent normally distributed coefficients, each with mean 
zero, but with the variance of the $i$th coefficient equal to $\binom{n}{i}$ (see [4], [31], [46]).
This particular random polynomial is probably the more natural definition of a 
random polynomial. It has 

$$E_n = \sqrt{n}$$

real zeros on average. 

As indicated in our table of contents, these problems serve as the departure point 
for generalizations to systems of equations and the real or complex zeros of other 
collections of random functions. For example, we consider power series, Fourier 
series, sums of orthogonal polynomials, Dirichlet series, matrix polynomials, and 
systems of equations. 

Section 2 begins with our elementary geometric derivation. Section 3 shows how 
a large class of random problems may be covered in this framework. In Section 4 we 
reveal what is going on mathematically. Section 5 studies arbitrary distributions but
focuses on the non-central normal. Section 6 relates random polynomials to random 
matrices, while Section 7 extends our results to systems of equations. Complex 
roots, which are ignored in the rest of the paper, are addressed in Section 8. We relate
random polynomials to the Buffon needle problem in Section 9. 

2. Random polynomials and elementary geometry 

Section 2.1 is restricted to elementary geometry. Polynomials are never mentioned.
The relationship is revealed in Section 2.2.

2.1. How fast do equators sweep out area? We will denote (the surface of)
the unit sphere centered at the origin in $\mathbb{R}^{n+1}$ by $S^n$. Our figures correspond to
the case $n = 2$. Higher dimensions provide no further complications.

Definition 2.1. If $P \in S^n$ is any point, the associated equator $P_\perp$ is the set of
points of $S^n$ on the plane perpendicular to the line from the origin to $P$.

This generalizes our familiar notion of the earth's equator, which is equal to
(north pole)$_\perp$ and also equal to (south pole)$_\perp$. See Figure 1. Notice that $P_\perp$ is
always a unit sphere ("great hypercircle") of dimension $n - 1$.

Let $\gamma(t)$ be a (rectifiable) curve on the sphere $S^n$.

Definition 2.2. Let $\gamma_\perp$, the equators of a curve, be the set $\{P_\perp \mid P \in \gamma\}$.

Assume that $\gamma$ has a finite length $|\gamma|$. Let $|\gamma_\perp|$ be the area "swept out" by $\gamma_\perp$
(we will provide a precise definition shortly). We wish to relate $|\gamma|$ to $|\gamma_\perp|$.

If the curve $\gamma$ is a small section of a great circle, then $\cup\gamma_\perp$ is a lune, the area
bounded by two equators as illustrated in Figure 2. If $\gamma$ is an arc of length $\theta$, then
our lune covers $\theta/\pi$ of the area of the sphere. The simplest case is $\theta = \pi$. We thus
obtain the formula valid for arcs of great circles, namely,

$$\frac{|\gamma_\perp|}{\text{area of } S^n} = \frac{|\gamma|}{\pi}.$$

If $\gamma$ is not a section of a great circle, we may approximate it by a union of small
great circular arcs, and the argument is still seen to apply.

The alert reader may notice something wrong. What if we continue our $\gamma$ so
that it is more than just half of a great circle, or what if our curve $\gamma$ spirals many
times around a point? Clearly, whenever $\gamma$ is not a piece of a great circle, the lunes
will overlap. The correct definition for $|\gamma_\perp|$ is the area swept out by $\gamma(t)_\perp$, as $t$
varies, counting multiplicities. We now give the precise definitions.



Figure 1. Points $P$ and associated equators $P_\perp$.

Figure 2. The lune $\cup\gamma_\perp$ when $\gamma$ is a great circular arc.

Definition 2.3. The multiplicity of a point $Q \in \cup\gamma_\perp$ is the number of equators in
$\gamma_\perp$ that contain $Q$, i.e., the cardinality of $\{t \in \mathbb{R} \mid Q \in \gamma(t)_\perp\}$.

Definition 2.4. We define $|\gamma_\perp|$ to be the area of $\cup\gamma_\perp$ counting multiplicity. More
precisely, we define $|\gamma_\perp|$ to be the integral of the multiplicity over $\cup\gamma_\perp$.

Lemma 2.1. If $\gamma$ is a rectifiable curve, then

$$\frac{|\gamma_\perp|}{\text{area of } S^n} = \frac{|\gamma|}{\pi}. \tag{1}$$

As an example, consider a point $P$ on the surface of the Earth. If we assume that
the point $P$ is receiving the direct ray of the sun (for our purposes, we consider
the sun to be fixed in space relative to the Earth during the course of a day, with
rays arriving in parallel), then $P_\perp$ is the great circle that divides day from night.
This great circle is known to astronomers as the terminator (Figure 3). During the
Earth's daily rotation, the point $P$ runs through all the points on a circle $\gamma$ of fixed
latitude. Similarly, the Earth's rotation generates the collection of terminators $\gamma_\perp$.

The multiplicity in $\gamma_\perp$ is two on a region between two latitudes. This is a fancy
mathematical way of saying that unless you are too close to the poles, you witness
both a sunrise and a sunset every day! The summer solstice is a convenient example.
$P$ is on the Tropic of Cancer and Equation (1) becomes

$$\frac{2 \times (\text{The surface area of the Earth between the Arctic/Antarctic Circles})}{\text{The surface area of the Earth}} = \frac{\text{The length of the Tropic of Cancer}}{\pi \times (\text{The radius of the Earth})}$$

or equivalently

$$\frac{\text{The surface area of the Earth between the Arctic/Antarctic Circles}}{\text{The surface area of the Earth}} = \frac{\text{The length of the Tropic of Cancer}}{\text{The length of the Equator}}.$$
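Lemma 2.1 can be checked numerically in the simplest setting. The sketch below is an illustration, not part of the original argument: it takes $\gamma$ to be an arc of the great circle $(\cos t, \sin t, 0)$ on $S^2$, samples uniform points $Q$ on the sphere, and averages the multiplicity of $Q$ in $\gamma_\perp$. The function names, the sample size, and the axial tilt of 23.44 degrees used for the solstice identity are assumptions introduced here.

```python
import math
import random

def covered_multiplicity(theta, samples=100_000, seed=0):
    """Monte Carlo estimate of |gamma_perp| / area(S^2) for a great-circle arc of length theta."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        # A Gaussian vector has a uniformly distributed direction; only the
        # first two coordinates of Q enter the orthogonality condition.
        q0, q1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Q lies on the equator gamma(t)_perp exactly when
        # q0*cos(t) + q1*sin(t) = 0, i.e. t = phi + k*pi for integer k.
        phi = math.atan2(-q0, q1) % math.pi
        t = phi
        while t <= theta:            # count every solution in [0, theta]
            total += 1
            t += math.pi
    return total / samples

def solstice_check(tilt_deg=23.44):
    """Both sides of the second solstice identity on a unit sphere (tilt is an assumed value)."""
    tilt = math.radians(tilt_deg)
    band_fraction = math.sin(math.pi / 2 - tilt)   # area between latitudes +/-(90 - tilt) over total area
    length_ratio = math.cos(tilt)                  # circle at latitude `tilt` over the equator
    return band_fraction, length_ratio

if __name__ == "__main__":
    for theta in (1.0, 4.0):
        print(theta, covered_multiplicity(theta), theta / math.pi)
    print(solstice_check())
```

Because multiplicities are counted, the estimate should track $\theta/\pi$ even when the arc is longer than half a great circle, which is the situation the "alert reader" paragraph worries about.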

2.2. The expected number of real zeros of a random polynomial. What 
does the geometric argument in the previous section and formula (1) in particular 
have to do with the number of real zeros of a random polynomial? Let 

$$p(x) = a_0 + a_1x + \cdots + a_nx^n$$

be a non-zero polynomial. Define the two vectors

$$a = \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \quad\text{and}\quad v(t) = \begin{pmatrix} 1 \\ t \\ t^2 \\ \vdots \\ t^n \end{pmatrix}.$$

The curve in $\mathbb{R}^{n+1}$ traced out by $v(t)$ as $t$ runs over the real line is called the
moment curve.

Figure 3. On the summer solstice, the direct ray of the sun
reaches $P$ on the Tropic of Cancer $\gamma$.

The condition that $x = t$ is a zero of the polynomial $a_0 + a_1x + \cdots + a_nx^n$ is
precisely the condition that $a$ is perpendicular to $v(t)$. Another way of saying this
is that $v(t)_\perp$ is the set of polynomials which have $t$ as a zero.

Define unit vectors

$$\mathbf{a} \equiv a/\|a\|, \qquad \gamma(t) \equiv v(t)/\|v(t)\|.$$

As before, $\gamma(t)_\perp$ corresponds to the polynomials which have $t$ as a zero.

When $n = 2$, the curve $\gamma$ is the intersection of an elliptical (squashed) cone and
the unit sphere. In particular, $\gamma$ is not planar. If we include the point at infinity,
$\gamma$ becomes a simple closed curve when $n$ is even. (In projective space, the curve is
closed for all $n$.) The number of times that a point $\mathbf{a}$ on our sphere is covered by
an equator is the multiplicity of $\mathbf{a}$ in $\gamma_\perp$. This is exactly the number of real zeros
of the corresponding polynomial.
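The correspondence between zeros and perpendicularity is easy to see on a concrete polynomial. The following minimal sketch uses the hypothetical example $p(x) = 2 - 3x + x^2 = (x-1)(x-2)$, which is not taken from the text; `moment_vector` and `dot` are illustrative helper names.

```python
def moment_vector(t, n):
    """v(t) = (1, t, ..., t^n)."""
    return [t ** k for k in range(n + 1)]

def dot(a, v):
    return sum(ai * vi for ai, vi in zip(a, v))

# Coefficient vector of p(x) = 2 - 3x + x^2 = (x - 1)(x - 2).
a = [2.0, -3.0, 1.0]

if __name__ == "__main__":
    for t in (1.0, 2.0, 3.0):
        # a . v(t) equals p(t), so it vanishes exactly at the zeros t = 1 and t = 2.
        print(t, dot(a, moment_vector(t, 2)))
```

Here the point $a/\|a\|$ lies on exactly two of the equators $\gamma(t)_\perp$, namely those for $t = 1$ and $t = 2$, so its multiplicity in $\gamma_\perp$ is two.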

So far, we have not discussed random polynomials. If the $a_i$ are independent
standard normals, then the vector $\mathbf{a}$ is uniformly distributed on the sphere $S^n$ since
the joint density function in spherical coordinates is a function of the radius alone.

Figure 4. When $n = 2$, $\gamma$ is the intersection of the sphere and
cone. The intersection is a curve that includes the North Pole and
a point on the Equator.

What is $E_n \equiv$ the expected number of real zeros of a random polynomial? A
random polynomial is identified with a uniformly distributed random point on the
sphere, so $E_n$ is the area $|\gamma_\perp|$, with our convention of counting multiplicities,
divided by the area of the sphere.

Equation (1) (read backwards!) states that

$$E_n = \frac{1}{\pi}|\gamma|.$$

Our question about the expected number of real zeros of a random polynomial is
reduced to finding the length of the curve $\gamma$. We compute this length in Section 2.3.

2.3. Calculating the length of $\gamma$. We invoke calculus to obtain the integral
formula for the length of $\gamma$ and hence the expected number of zeros of a random
polynomial. The result was first obtained by Kac in 1943.

Theorem 2.1 (Kac formula). The expected number of real zeros of a degree $n$ polynomial
with independent standard normal coefficients is

$$E_n = \frac{1}{\pi}\int_{-\infty}^{\infty}\sqrt{\frac{1}{(t^2-1)^2} - \frac{(n+1)^2t^{2n}}{(t^{2n+2}-1)^2}}\;dt. \tag{2}$$
Proof. The standard arclength formula is

$$|\gamma| = \int_{-\infty}^{\infty}\|\gamma'(t)\|\,dt.$$



We may proceed in two different ways. 

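Method I (Direct approach). To calculate the integrand, we first consider any differentiable
$v(t) : \mathbb{R} \to \mathbb{R}^{n+1}$. It is not hard to show that

$$\gamma'(t) = \left(\frac{v(t)}{\sqrt{v(t)\cdot v(t)}}\right)' = \frac{[v(t)\cdot v(t)]\,v'(t) - [v(t)\cdot v'(t)]\,v(t)}{[v(t)\cdot v(t)]^{3/2}},$$

and therefore,

$$\|\gamma'(t)\|^2 = \left(\frac{v(t)}{\sqrt{v(t)\cdot v(t)}}\right)'\cdot\left(\frac{v(t)}{\sqrt{v(t)\cdot v(t)}}\right)' = \frac{[v(t)\cdot v(t)][v'(t)\cdot v'(t)] - [v(t)\cdot v'(t)]^2}{[v(t)\cdot v(t)]^2}.$$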

If $v(t)$ is the moment curve, then we may calculate $\|\gamma'(t)\|$ with the help of the
following observations and some messy algebra:

$$v(t)\cdot v(t) = 1 + t^2 + t^4 + \cdots + t^{2n} = \frac{1-t^{2n+2}}{1-t^2};$$

$$v'(t)\cdot v(t) = t + 2t^3 + 3t^5 + \cdots + nt^{2n-1} = \frac{1}{2}\frac{d}{dt}\left(\frac{1-t^{2n+2}}{1-t^2}\right) = \frac{t\,(1 - t^{2n} - nt^{2n} + nt^{2n+2})}{(t^2-1)^2};$$

$$v'(t)\cdot v'(t) = 1 + 4t^2 + 9t^4 + \cdots + n^2t^{2n-2} = \frac{1}{4t}\frac{d}{dt}\,t\,\frac{d}{dt}\left(\frac{1-t^{2n+2}}{1-t^2}\right) = \frac{t^{2n+2} - t^2 - 1 + t^{2n}(nt^2 - n - 1)^2}{(t^2-1)^3}.$$



Thus we arrive at the Kac formula:

$$E_n = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sqrt{(t^{2n+2}-1)^2 - (n+1)^2t^{2n}(t^2-1)^2}}{(t^2-1)(t^{2n+2}-1)}\,dt = \frac{1}{\pi}\int_{-\infty}^{\infty}\sqrt{\frac{1}{(t^2-1)^2} - \frac{(n+1)^2t^{2n}}{(t^{2n+2}-1)^2}}\;dt.$$
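The Kac formula can be evaluated numerically without touching the removable singularities at $t = \pm 1$ by working directly with the three dot products of Method I. The sketch below is one possible way to do this, not the authors' computation. It assumes the symmetry of the integrand under $t \to -t$ and $t \to 1/t$, so that $E_n$ is four times the integral over $[0, 1]$; the function names and the step count are illustrative choices.

```python
import math

def speed_squared(t, n):
    """||gamma'(t)||^2 for the moment curve, from v.v, v.v' and v'.v'."""
    vv = vdv = dvdv = 0.0
    for k in range(n + 1):
        vv += t ** (2 * k)
        if k >= 1:
            vdv += k * t ** (2 * k - 1)
            dvdv += k * k * t ** (2 * k - 2)
    return (vv * dvdv - vdv * vdv) / (vv * vv)

def kac_expected_zeros(n, steps=4000):
    """E_n = (1/pi) * integral of ||gamma'(t)|| over R, computed as 4 * integral over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                          # midpoint rule
        total += math.sqrt(max(speed_squared(t, n), 0.0))
    return 4.0 * total * h / math.pi

if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, kac_expected_zeros(n))
```

For $n = 1$ the integrand reduces to $1/(1+t^2)$, so the result should be 1, as it must be for a linear polynomial. At $t = 1$ the dot products are sums of integers, giving $\|\gamma'(1)\|^2 = n(n+2)/12$.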



Method II (Sneaky version). By introducing a logarithmic derivative, we can avoid
the messy algebra in Method I. Let $v(t) : \mathbb{R} \to \mathbb{R}^{n+1}$ be any differentiable curve.
Then it is easy to check that

$$\left.\frac{\partial^2}{\partial x\,\partial y}\log[v(x)\cdot v(y)]\right|_{y=x=t} = \|\gamma'(t)\|^2. \tag{3}$$
Thus we have an alternative expression for $\|\gamma'(t)\|^2$.
When $v(t)$ is the moment curve,

$$v(x)\cdot v(y) = 1 + xy + x^2y^2 + \cdots + x^ny^n = \frac{1-(xy)^{n+1}}{1-xy};$$

the Kac formula is then

$$E_n = \frac{1}{\pi}\int_{-\infty}^{\infty}\sqrt{\left.\frac{\partial^2}{\partial x\,\partial y}\log\frac{1-(xy)^{n+1}}{1-xy}\right|_{y=x=t}}\;dt.$$

This version of the Kac formula first appeared in [31]. In Section 4.4, we relate this
sneaky approach to the so-called "Fubini-Study" metric. □
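Identity (3) lends itself to a quick numerical sanity check. The sketch below compares a central-difference approximation of the mixed logarithmic derivative with the direct expression from Method I for the moment curve. The step size `h` and the helper names are assumptions made for this illustration, and the finite-difference value is only accurate to several digits.

```python
import math

def kernel(x, y, n):
    """v(x) . v(y) = 1 + xy + ... + (xy)^n for the moment curve."""
    return sum((x * y) ** k for k in range(n + 1))

def mixed_log_derivative(t, n, h=1e-4):
    """Central-difference approximation of d^2/dxdy log(v(x).v(y)) at x = y = t."""
    L = lambda x, y: math.log(kernel(x, y, n))
    return (L(t + h, t + h) - L(t + h, t - h)
            - L(t - h, t + h) + L(t - h, t - h)) / (4.0 * h * h)

def speed_squared(t, n):
    """Method I: ([v.v][v'.v'] - [v.v']^2) / [v.v]^2."""
    vv = sum(t ** (2 * k) for k in range(n + 1))
    vdv = sum(k * t ** (2 * k - 1) for k in range(1, n + 1))
    dvdv = sum(k * k * t ** (2 * k - 2) for k in range(1, n + 1))
    return (vv * dvdv - vdv * vdv) / (vv * vv)

if __name__ == "__main__":
    for t in (0.3, 0.9, 1.0, 1.7):
        print(t, mixed_log_derivative(t, 5), speed_squared(t, 5))
```

The two columns should agree to roughly the accuracy of the finite-difference scheme, including at $t = 1$ where the closed-form ratio $(1-(xy)^{n+1})/(1-xy)$ would need a limit.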

2.4. The density of zeros. Up until now, we have focused on the length of $\gamma =
\{\gamma(t) \mid -\infty < t < \infty\}$ and concluded that it equals the expected number of zeros
on the real line multiplied by $\pi$. What we really did, however, was compute the
density of real zeros. Thus

$$\rho_n(t) = \frac{1}{\pi}\sqrt{\frac{1}{(t^2-1)^2} - \frac{(n+1)^2t^{2n}}{(t^{2n+2}-1)^2}}$$

is the expected number of real zeros per unit length at the point $t \in \mathbb{R}$. This is
a true density: integrating $\rho_n(t)$ over any interval produces the expected number
of real zeros on that interval. The probability density for a random real zero is
$\rho_n(t)/E_n$. It is straightforward [26, 27] to see that as $n \to \infty$, the real zeros are
concentrated near the points $t = \pm 1$.

The asymptotic behavior of both the density and expected number of real zeros 
is derived in the subsection below. 

2.5. The asymptotics of the Kac formula. A short argument could have shown
that $E_n \sim \frac{2}{\pi}\log n$ [26], but since several researchers, including Christensen,
Sambandham, Stevens, and Wilkins have sharpened Kac's original estimate, we show
here how successive terms of the asymptotic series may be derived, although we
will derive only a few terms of the series explicitly. The constant $C_1$ and the next
term were unknown to previous researchers. See [2, pp. 90-91] for a summary
of previous estimates of $C_1$.

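Theorem 2.2. As $n \to \infty$,

$$E_n = \frac{2}{\pi}\log(n) + C_1 + \frac{2}{n\pi} + O(1/n^2),$$

where

$$C_1 = \frac{2}{\pi}\left(\log(2) + \int_0^\infty\left\{\sqrt{\frac{1}{x^2} - \frac{4e^{-2x}}{(1-e^{-2x})^2}} - \frac{1}{x+1}\right\}dx\right) = 0.6257358072\ldots.$$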

Proof. We now study the asymptotic behavior of the density of zeros. To do this,
we make the change of variables $t = 1 + x/n$, so

$$E_n = 4\int_0^\infty\hat\rho_n(x)\,dx,$$

where

$$\hat\rho_n(x) = \frac{1}{n\pi}\sqrt{\frac{n^4}{x^2(2n+x)^2} - \frac{(n+1)^2(1+x/n)^{2n}}{[(1+x/n)^{2n+2}-1]^2}}$$

is the (transformed) density of zeros. Using

$$\left(1+\frac{x}{n}\right)^n = e^x\left(1 - \frac{x^2}{2n}\right) + O(1/n^2),$$
we see that for any fixed $x$, as $n \to \infty$, the density of zeros is given by

$$\hat\rho_n(x) = \rho_\infty(x) + \left[\frac{x(2-x)}{2n}\rho_\infty(x)\right]' + O(1/n^2), \tag{4}$$

where

$$\rho_\infty(x) \equiv \frac{1}{2\pi}\left[\frac{1}{x^2} - \frac{4e^{-2x}}{(1-e^{-2x})^2}\right]^{1/2}.$$

This asymptotic series cannot be integrated term by term. We solve this problem
by noting that

$$\frac{\chi[x>1]}{2\pi x} - \frac{1}{2\pi(2n+x)} = \frac{\chi[x>1]}{2\pi x} - \frac{1}{4n\pi} + O(1/n^2), \tag{5}$$

where we have introduced the factor

$$\chi[x>1] \equiv \begin{cases} 1 & \text{if } x > 1, \\ 0 & \text{if } x \le 1 \end{cases}$$

to avoid the pole at $x = 0$. Subtracting (5) from (4), we obtain
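$$\hat\rho_n(x) - \frac{\chi[x>1]}{2\pi x} + \frac{1}{2\pi(2n+x)} = \left\{\rho_\infty(x) - \frac{\chi[x>1]}{2\pi x}\right\} + \left[\frac{x(2-x)}{2n}\rho_\infty(x) + \frac{x}{4n\pi}\right]' + O(1/n^2).$$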



We then integrate term by term from 0 to $\infty$ to get

$$\int_0^\infty\hat\rho_n(x)\,dx - \frac{1}{2\pi}\log(2n) = \int_0^\infty\left\{\rho_\infty(x) - \frac{\chi[x>1]}{2\pi x}\right\}dx + \frac{1}{2n\pi} + O(1/n^2).$$



The theorem immediately follows from this formula and one final trick: we replace
$\chi[x>1]/x$ with $1/(x+1)$ in the definition of $C_1$ so we can express it as a single
integral of an elementary function. □
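The constant and the first terms of the expansion can be reproduced numerically. The sketch below is an illustration under stated assumptions: it writes $4e^{-2x}/(1-e^{-2x})^2$ as $1/\sinh^2 x$, uses a short Taylor series near $x = 0$ to avoid cancellation, truncates the integral at an assumed cutoff where the integrand is $1/x - 1/(x+1)$ to machine precision, and compares the three-term expansion with a direct quadrature of the Kac integrand on $[0, 1]$ (using its symmetry under $t \to 1/t$ and $t \to -t$). All function names and numerical parameters are choices made here.

```python
import math

def g(x):
    """sqrt(1/x^2 - 1/sinh(x)^2), the bracket in rho_infinity without the 1/(2 pi)."""
    if x < 1e-2:
        # Series: 1/x^2 - 1/sinh^2(x) = 1/3 - x^2/15 + 2 x^4/189 - ...
        return math.sqrt(1.0 / 3.0 - x * x / 15.0 + 2.0 * x ** 4 / 189.0)
    return math.sqrt(1.0 / (x * x) - 1.0 / math.sinh(x) ** 2)

def c1(upper=40.0, steps=40_000):
    """C_1 = (2/pi) * (log 2 + integral over [0, inf) of g(x) - 1/(x+1))."""
    f = lambda x: g(x) - 1.0 / (x + 1.0)
    h = upper / steps
    s = f(0.0) + f(upper)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(i * h)        # composite Simpson rule
    integral = s * h / 3.0
    integral += math.log(1.0 + 1.0 / upper)        # tail: integral of 1/x - 1/(x+1)
    return 2.0 / math.pi * (math.log(2.0) + integral)

def kac_expected_zeros(n, steps=4000):
    """Direct midpoint quadrature of the Kac density on [0, 1], times 4."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        vv = sum(t ** (2 * k) for k in range(n + 1))
        vdv = sum(k * t ** (2 * k - 1) for k in range(1, n + 1))
        dvdv = sum(k * k * t ** (2 * k - 2) for k in range(1, n + 1))
        total += math.sqrt(max(vv * dvdv - vdv * vdv, 0.0)) / vv
    return 4.0 * total * h / math.pi

def asymptotic(n, constant):
    return 2.0 / math.pi * math.log(n) + constant + 2.0 / (n * math.pi)

if __name__ == "__main__":
    constant = c1()
    print("C_1 ~", constant)
    for n in (10, 50):
        print(n, kac_expected_zeros(n), asymptotic(n, constant))
```

The comparison for moderate $n$ is a useful guard on the sign of the $2/(n\pi)$ term, since flipping it would shift the prediction by $4/(n\pi)$.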

3. Random functions with central normal coefficients 

Reviewing the discussion in Section 2, we see that we could omit some members
of our basis set $\{1, x, x^2, \ldots, x^n\}$ and ask how many zeros are expected to be real
of an $n$th degree polynomial with, say, its cubic term deleted. The proof would
hardly change. Or we can change the function space entirely and ask how many
zeros of the random function

$$a_0 + a_1\sin(x) + a_2e^{|x|}$$

are expected to be real; the answer is 0.63662. The only assumption is that the
coefficients are independent standard normals. If $f_0, f_1, \ldots, f_n$ is any collection of
rectifiable functions, we may define the analogue of the moment curve

$$v(t) = \begin{pmatrix} f_0(t) \\ f_1(t) \\ \vdots \\ f_n(t) \end{pmatrix}. \tag{6}$$
The function $\frac{1}{\pi}\|\gamma'(t)\|$ is the density of a real zero; its integral over $\mathbb{R}$ is the expected
number of real zeros.

We may relax the assumption that the coefficient vector $a = (a_0, \ldots, a_n)^T$ contains
independent standard normals by allowing for any multivariate distribution
with zero mean. If the $a_i$ are normally distributed, $E(a) = 0$ and $E(aa^T) = C$,
then $a$ is a (central) multivariate normal distribution with covariance matrix $C$.
It is easy to see that $a$ has this distribution if and only if $C^{-1/2}a$ is a vector of
standard normals. Since

$$a\cdot v(t) = C^{-1/2}a\cdot C^{1/2}v(t),$$

the density of real zeros with coefficients from an arbitrary central multivariate
normal distribution is

$$\frac{1}{\pi}\|\mathbf{w}'(t)\|, \quad\text{where } w(t) = C^{1/2}v(t), \text{ and } \mathbf{w}(t) = w(t)/\|w(t)\|. \tag{7}$$

The expected number of real zeros is the integral of $\frac{1}{\pi}\|\mathbf{w}'(t)\|$.
We now state our general result.
We now state our general result. 

Theorem 3.1. Let $v(t) = (f_0(t), \ldots, f_n(t))^T$ be any collection of differentiable
functions and $a_0, \ldots, a_n$ be the elements of a multivariate normal distribution
with mean zero and covariance matrix $C$. The expected number of real
zeros on an interval (or measurable set) $I$ of the equation

$$a_0f_0(t) + a_1f_1(t) + \cdots + a_nf_n(t) = 0$$

is

$$\int_I\frac{1}{\pi}\|\mathbf{w}'(t)\|\,dt,$$

where $\mathbf{w}$ is defined by Equations (7). In logarithmic derivative notation this is

$$\frac{1}{\pi}\int_I\left(\left.\frac{\partial^2}{\partial x\,\partial y}\log\left(v(x)^TCv(y)\right)\right|_{y=x=t}\right)^{1/2}dt.$$

Geometrically, changing the covariance is the same as changing the inner product
on the space of functions.
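Theorem 3.1 reduces each example to a single function $K(x, y) = v(x)^TCv(y)$, which suggests a small generic routine. The sketch below is illustrative only: `zero_density` approximates the mixed logarithmic derivative by central differences and `expected_zeros` integrates it with the midpoint rule. The names, step sizes, and the choice of test kernels are assumptions, and the finite differences limit the accuracy to several digits.

```python
import math

def zero_density(kernel, t, h=1e-4):
    """(1/pi) * sqrt(d^2/dxdy log kernel(x, y)) at x = y = t, by central differences."""
    L = lambda x, y: math.log(kernel(x, y))
    mixed = (L(t + h, t + h) - L(t + h, t - h)
             - L(t - h, t + h) + L(t - h, t - h)) / (4.0 * h * h)
    return math.sqrt(max(mixed, 0.0)) / math.pi

def expected_zeros(kernel, a, b, steps=2000):
    """Midpoint-rule integral of the zero density over the interval [a, b]."""
    w = (b - a) / steps
    return sum(zero_density(kernel, a + (i + 0.5) * w) for i in range(steps)) * w

# Quadratic with independent standard normal coefficients: basis {1, x, x^2}, C = I.
quadratic = lambda x, y: 1.0 + x * y + (x * y) ** 2
# Degree-4 polynomial with its cubic term deleted: basis {1, x, x^2, x^4}, C = I.
no_cubic = lambda x, y: 1.0 + x * y + (x * y) ** 2 + (x * y) ** 4

if __name__ == "__main__":
    print(expected_zeros(quadratic, -1.0, 1.0))
    print(expected_zeros(no_cubic, -1.0, 1.0))
    print(zero_density(lambda x, y: math.exp(x * y), 0.7), 1.0 / math.pi)
```

Working with the kernel rather than with $C^{1/2}$ avoids any matrix square root, which is the practical advantage of the logarithmic derivative form.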

We now enumerate several examples of Theorem 3.1. We consider examples for
which $v(x)^TCv(y)$ is a nice enough function of $x$ and $y$ that the density of zeros
can be easily described. For a survey of the literature, see [2], which also includes
the results of numerical experiments. In our discussion of random series, proofs
of convergence are omitted. Interested readers may refer to [45]. We also suggest
the classic book of J.-P. Kahane [28], where other problems about random series of
functions are considered.



3.1. Random polynomials. 



3.1.1. The Kac formula. If the coefficients of random polynomials are independent
standard normal random variables, we saw in the previous section that from

$$v(x)^TCv(y) = \frac{1-(xy)^{n+1}}{1-xy} \tag{8}$$

we can derive the Kac formula.
3.1.2. A random polynomial with a simple answer. Consider random polynomials

$$a_0 + a_1x + \cdots + a_nx^n,$$

where the $a_i$ are independent normals with variances $\binom{n}{i}$. Such random polynomials
have been studied because of their mathematical properties [31, 46] and because of
their relationship to quantum physics [4].
By the binomial theorem,

$$v(x)^TCv(y) = \sum_{k=0}^n\binom{n}{k}x^ky^k = (1+xy)^n.$$

We see that the density of zeros is given by

$$\rho(t) = \frac{\sqrt{n}}{\pi(1+t^2)}.$$

This is a Cauchy distribution, that is, $\arctan(t)$ is uniformly distributed on $[-\pi/2, \pi/2]$.
Integrating the density shows that the expected number of real zeros is $\sqrt{n}$. As we
shall see in Section 4.1, this simple expected value and density is reflected in the
geometry of $\gamma$.
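The $\sqrt{n}$ law can be probed with a small simulation. The sketch below is a rough Monte Carlo experiment, not a proof: it draws coefficients with variances $\binom{n}{k}$, substitutes $t = \tan\theta$ so that the whole real line maps to $(-\pi/2, \pi/2)$, and counts sign changes on a grid. The grid resolution, trial count, and seed are arbitrary assumptions, and two zeros falling between adjacent grid points would be missed.

```python
import math
import random

def mean_real_zeros(n, trials=1500, grid=200, seed=1):
    """Average number of sign changes of sum_k a_k sin^k(theta) cos^(n-k)(theta) on a theta grid."""
    rng = random.Random(seed)
    sd = [math.sqrt(math.comb(n, k)) for k in range(n + 1)]
    thetas = [-math.pi / 2 + math.pi * j / grid for j in range(grid + 1)]
    # cos(theta)^n * p(tan(theta)) has the same zeros as p(t) for theta in (-pi/2, pi/2).
    basis = [[math.sin(th) ** k * math.cos(th) ** (n - k) for k in range(n + 1)]
             for th in thetas]
    total = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, s) for s in sd]
        values = [sum(ak * bk for ak, bk in zip(a, row)) for row in basis]
        # A sign change between neighbouring samples marks a zero in between.
        total += sum(1 for u, w in zip(values, values[1:]) if u * w < 0)
    return total / trials

if __name__ == "__main__":
    for n in (4, 9):
        print(n, mean_real_zeros(n), math.sqrt(n))
```

The substitution also makes the uniform distribution of $\arctan(t)$ visible: a histogram of the detected zeros in $\theta$ should be approximately flat.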

As an application, assume that $p(t)$ and $q(t)$ are independent random polynomials
of degree $n$ with coefficients distributed as in this example. By considering the
equation $p(t) - tq(t) = 0$, it is possible to show that the expected number of fixed
points of the rational mapping

$$p(t)/q(t) : \mathbb{R}\cup\{\infty\} \to \mathbb{R}\cup\{\infty\}$$

is exactly $\sqrt{n+1}$.

3.1.3. Application: Spijker's lemma on the Riemann sphere. Any curve in $\mathbb{R}^n$ can
be interpreted as $v(t)$ for some space of random functions. Let

$$r(t) = \frac{a(t) + ib(t)}{c(t) + id(t)}$$

be any rational function, where $a$, $b$, $c$, and $d$ are real polynomials of a real variable
$t$. Let $\gamma$ be the stereographic projection of $r(t)$ onto the Riemann sphere. It is not
difficult to show that $\gamma$ is the projection of the curve

$$(f_0(t), f_1(t), f_2(t))$$

onto the unit (Riemann) sphere, where $f_0 = 2(ac + bd)$, $f_1 = 2(bc - ad)$, and
$f_2 = a^2 + b^2 - c^2 - d^2$. The geometry is illustrated in Figure 5.

Therefore the length of $\gamma$ is $\pi$ times the expected number of real zeros of the
random function

$$a_0f_0 + a_1f_1 + a_2f_2,$$

where the $a_i$ are independent standard normals. For example, if $a$, $b$, $c$, and $d$ are
polynomials of degrees no more than $n$, then any such function has degree at most
$2n$, so the length of $\gamma$ can be no more than $2n\pi$. By taking a Möbius transformation,
we arrive at Spijker's lemma:

The image, on the Riemann sphere, of any circle under a complex rational mapping,
with numerator and denominator having degrees no more than $n$, has length no
longer than $2n\pi$.

This example was obtained from Wegert and Trefethen [51]. 
3.1.4. Random sums of orthogonal polynomials. Consider the vector space of polynomials
of the form $\sum_{k=0}^n a_kP_k(x)$ where $a_k$ are independent standard normal random
variables and where $\{P_k(x)\}$ is a set of normalized orthogonal polynomials
with any non-negative weight function on any interval. The Darboux-Christoffel
formula [21, 8.902] states that

$$\sum_{k=0}^n P_k(x)P_k(y) = \left(\frac{q_n}{q_{n+1}}\right)\frac{P_n(y)P_{n+1}(x) - P_n(x)P_{n+1}(y)}{x - y},$$

where $q_n$ (resp. $q_{n+1}$) is the leading coefficient of $P_n$ (resp. $P_{n+1}$). With this
formula and a bit of work, we see that

$$\rho(t) = \frac{\sqrt{3}}{6\pi}\sqrt{2G'(t) - G^2(t)},$$

where

$$G(t) = \frac{d}{dt}\log\frac{d}{dt}\left(\frac{P_{n+1}(t)}{P_n(t)}\right).$$

This is equivalent to formula (5.21) in [2]. Interesting asymptotic results have been
derived by Das and Bhatt. The easiest example to consider is that of random sums
of Chebyshev polynomials, for which the density of zeros is an elementary function
of $n$ and $t$.

3.2. Random infinite series. 

3.2.1. Power series with uncorrelated coefficients. Consider a random power series

$$f(x) = a_0 + a_1x + a_2x^2 + \cdots,$$

where $a_k$ are independent standard normal random variables. This has radius of
convergence one with probability one. Thus we will assume that $-1 < x < 1$. In
this case,

$$v(x)^TCv(y) = \frac{1}{1-xy}.$$

The logarithmic derivative reveals a density of zeros of the form

$$\rho(t) = \frac{1}{\pi(1-t^2)}.$$

We see that the expected number of zeros on any subinterval $[a, b]$ of $(-1, 1)$ is

$$\frac{1}{2\pi}\log\frac{(1-a)(1+b)}{(1+a)(1-b)}.$$

This result may also be derived from the original Kac formula by letting $n \to \infty$.

3.2.2. Power series with correlated coefficients. What effect does correlation have
on the density of zeros? We will consider a simple generalization of the previous
example. Consider the random power series

$$f(x) = a_0 + a_1x + a_2x^2 + \cdots,$$

where $a_k$ are standard normal random variables, but assume that the correlation
between $a_k$ and $a_{k+1}$ equals some constant $r$ for all $k$. Thus the covariance matrix
is tridiagonal with one on the diagonal and $r$ on the superdiagonal and subdiagonal.
In order to assure that this matrix be positive definite, we will assume that $|r| < \frac{1}{2}$.
By the Gershgorin Theorem the spectral radius of the covariance matrix is less
than or equal to $1 + 2|r|$, and therefore the radius of convergence of the random
series is independent of $r$. Thus we will, as in the previous example, assume
that $-1 < x < 1$. We see that

$$v(x)^TCv(y) = \frac{1 + r(x+y)}{1 - xy},$$

so

$$\rho(t) = \frac{1}{\pi}\sqrt{\frac{1}{(1-t^2)^2} - \frac{r^2}{(1+2rt)^2}}.$$

Notice that the correlation between coefficients has decreased the density of zeros
throughout the interval.
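Both the closed form of $v(x)^TCv(y)$ and the claim that correlation lowers the density can be checked directly. The sketch below truncates the tridiagonal quadratic form after an assumed number of terms and compares it with $(1 + r(x+y))/(1-xy)$; the function names and test points are illustrative.

```python
import math

def kernel_truncated(x, y, r, terms=200):
    """v(x)^T C v(y) with 1 on the diagonal and r on the two adjacent diagonals, truncated."""
    s = 0.0
    for k in range(terms):
        s += (x * y) ** k                                        # diagonal entries
        s += r * (x ** k * y ** (k + 1) + x ** (k + 1) * y ** k)  # off-diagonal entries
    return s

def kernel_closed(x, y, r):
    return (1.0 + r * (x + y)) / (1.0 - x * y)

def density(t, r):
    """rho(t) for the correlated power series; r = 0 gives the uncorrelated case."""
    return math.sqrt(1.0 / (1.0 - t * t) ** 2 - r * r / (1.0 + 2.0 * r * t) ** 2) / math.pi

if __name__ == "__main__":
    print(kernel_truncated(0.3, -0.5, 0.4), kernel_closed(0.3, -0.5, 0.4))
    for t in (-0.8, 0.0, 0.5):
        print(t, density(t, 0.0), density(t, 0.4))
```

Since the subtracted term $r^2/(1+2rt)^2$ is positive for every $r \ne 0$, the correlated density is strictly below the uncorrelated one at each $t$ in $(-1, 1)$.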

3.2.3. Random entire functions. Consider a random power series

$$f(x) = a_0 + a_1x + a_2x^2 + \cdots,$$

where $a_k$ are independent central normal random variables with variances $1/k!$, i.e.,
the covariance matrix is diagonal with the numbers $1/k!$ down the diagonal. This
series has infinite radius of convergence with probability one. Now clearly

$$v(x)^TCv(y) = e^{xy},$$

so $\rho(t) = 1/\pi$. In other words, the real zeros are uniformly distributed on the real
line, with a density of $1/\pi$ zeros per unit length.

3.2.4. Random trigonometric sums and Fourier series. Consider the trigonometric
sum

$$\sum_{k=0}^{\infty} a_k\cos\nu_k\theta + b_k\sin\nu_k\theta,$$

where $a_k$ and $b_k$ are independent normal random variables with means zero and
variances $\sigma_k^2$. Notice that

$$v(x)^TCv(y) = \sum_{k=0}^{\infty}\sigma_k^2(\sin\nu_kx\sin\nu_ky + \cos\nu_kx\cos\nu_ky) = \sum_{k=0}^{\infty}\sigma_k^2\cos\nu_k(x-y),$$

and we see that the density of roots is constant. Thus the real zeros of the random
trigonometric sum are uniformly distributed on the real line, and the expected
number of zeros on the interval $[a, b]$ is

$$\frac{b-a}{\pi}\sqrt{\frac{\sum\nu_k^2\sigma_k^2}{\sum\sigma_k^2}}.$$
Note that the slower the rate of convergence of the series, the higher the root density.
For example, if $\sigma_k = k^{-3/2}$ and $\nu_k = k$, then the series converges uniformly with
probability one, but the root density is infinite.
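The constant root density of a random trigonometric sum, $\frac{1}{\pi}\sqrt{\sum\nu_k^2\sigma_k^2/\sum\sigma_k^2}$, follows from applying the logarithmic derivative to $\sum\sigma_k^2\cos\nu_k(x-y)$. It is simple enough to tabulate. The sketch below is a minimal illustration with hypothetical function names; the single-frequency case serves as a sanity check, because $a\cos\nu\theta + b\sin\nu\theta$ is a shifted cosine with exactly $2\nu$ zeros per period $2\pi$.

```python
import math

def trig_zero_density(nu, sigma):
    """(1/pi) * sqrt(sum nu_k^2 sigma_k^2 / sum sigma_k^2) for finite lists nu, sigma."""
    num = sum((v * s) ** 2 for v, s in zip(nu, sigma))
    den = sum(s * s for s in sigma)
    return math.sqrt(num / den) / math.pi

def expected_zeros(nu, sigma, a, b):
    """Expected number of zeros on [a, b]; the density does not depend on position."""
    return (b - a) * trig_zero_density(nu, sigma)

if __name__ == "__main__":
    print(expected_zeros([3.0], [2.0], 0.0, 2.0 * math.pi))        # single frequency
    print(expected_zeros([1.0, 5.0], [1.0, 1.0], 0.0, 2.0 * math.pi))
```

With two equally weighted frequencies the density is the root-mean-square of the individual densities, which is the Pythagorean flavour that Section 4.2 makes precise.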

The similarity between this formula and the Pythagorean theorem is more than
superficial, as we will see when we discuss the geodesics of flat tori in Section 4.2.
Several authors, including Christensen, Das, Dunnage, Jamrom, Maruthachalam,
Qualls, and Sambandham [2] have derived results about the expected number of
zeros of these and other trigonometric sums.

3.2.5. Random Dirichlet series. Consider a random Dirichlet series

$$f(x) = a_1 + \frac{a_2}{2^x} + \frac{a_3}{3^x} + \cdots,$$

where $a_k$ are independent standard normal random variables. This converges with
probability one if $x > 1/2$. We see that

$$v(x)^TCv(y) = \sum_{k=1}^{\infty}\frac{1}{k^{x+y}} = \zeta(x+y)$$

and that the expected number of zeros on any interval $[a, b]$, $a > 1/2$, is

$$\frac{1}{2\pi}\int_a^b\sqrt{[\log\zeta(2t)]''}\,dt.$$

4. Theoretical considerations 

4.1. A curve with more symmetries than meet the eye. We return to the
example in Section 3.1.2 and explore why the distribution of the real zeros was so
simple. Take the curve $w(t) = C^{1/2}v(t)$ of (7), make the change of variables $t = \tan\theta$, and scale
to obtain

$$\gamma(\theta) = (\cos^n\theta)\,w(\tan\theta).$$

Doing so shows that

$$\gamma(\theta) = \begin{pmatrix} \binom{n}{0}^{1/2}\cos^n\theta \\ \binom{n}{1}^{1/2}\cos^{n-1}\theta\,\sin\theta \\ \binom{n}{2}^{1/2}\cos^{n-2}\theta\,\sin^2\theta \\ \vdots \\ \binom{n}{n}^{1/2}\sin^n\theta \end{pmatrix},$$

i.e., $\gamma_k(\theta) = \binom{n}{k}^{1/2}\cos^{n-k}\theta\,\sin^k\theta$, where the dimension index $k$ runs from 0 to $n$.

We have chosen to denote this curve the super-circle. The binomial expansion of
$(\cos^2\theta + \sin^2\theta)^n = 1$ tells us that our super-circle lives on the unit sphere. Indeed
when $n = 2$, the super-circle is merely a small-circle on the unit sphere in $\mathbb{R}^3$. When
$n = 1$, the super-circle is the unit circle in the plane.

What is not immediately obvious is that every point on this curve "looks" exactly
the same. We will display an orthogonal matrix $Q(\phi)$ that rotates $\mathbb{R}^{n+1}$ in such a
manner that each and every point on the super-circle $\gamma(\theta)$ is sent to $\gamma(\theta + \phi)$. To
do this, we show that $\gamma$ is a solution to a "nice" ordinary differential equation.
By a simple differentiation of the $k$th component of $\gamma(\theta)$, we see that

$$\frac{d}{d\theta}\gamma_k(\theta) = \alpha_k\gamma_{k-1}(\theta) - \alpha_{k+1}\gamma_{k+1}(\theta), \qquad k = 0, \ldots, n,$$

where $\alpha_k \equiv \sqrt{k(n+1-k)}$. In matrix-vector notation this means that

$$\frac{d}{d\theta}\gamma(\theta) = A\gamma(\theta), \quad\text{where } A = \begin{pmatrix} 0 & -\alpha_1 & & & & \\ \alpha_1 & 0 & -\alpha_2 & & & \\ & \alpha_2 & 0 & -\alpha_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & \alpha_{n-1} & 0 & -\alpha_n \\ & & & & \alpha_n & 0 \end{pmatrix}, \tag{9}$$

i.e., $A$ has the $\alpha_i$ on the subdiagonal, the $-\alpha_i$ on the superdiagonal, and 0 everywhere
else, including the main diagonal.

The solution to the ordinary differential equation (9) is

$$\gamma(\theta) = e^{A\theta}\gamma(0). \tag{10}$$

The matrix $Q(\phi) \equiv e^{A\phi}$ is orthogonal because $A$ is anti-symmetric, and indeed $Q(\phi)$
is the orthogonal matrix that we promised would send $\gamma(\theta)$ to $\gamma(\theta + \phi)$. We suspect
that (10) with the specification that $\gamma(0) = (1, 0, \ldots, 0)^T$ is the most convenient
description of the super-circle. Differentiating (10) any number of times shows
explicitly that

$$\frac{d^j\gamma}{d\theta^j}(\theta) = e^{A\theta}\,\frac{d^j\gamma}{d\theta^j}(0).$$

In particular, the speed is invariant. A quick check shows that it is $\sqrt{n}$. If we let $\theta$
run from $-\pi/2$ to $\pi/2$, we trace out a curve of length $\pi\sqrt{n}$.
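The three properties of the super-circle used above (unit norm, the differential equation (9), and constant speed $\sqrt{n}$) can be confirmed numerically. The following sketch is an illustration with assumed helper names; it builds $A$ from $\alpha_k = \sqrt{k(n+1-k)}$ and compares $A\gamma(\theta)$ with a central-difference derivative of $\gamma$.

```python
import math

def supercircle(theta, n):
    """gamma_k(theta) = binom(n, k)^(1/2) * cos^(n-k)(theta) * sin^k(theta), k = 0..n."""
    c, s = math.cos(theta), math.sin(theta)
    return [math.sqrt(math.comb(n, k)) * c ** (n - k) * s ** k for k in range(n + 1)]

def generator(n):
    """Antisymmetric matrix A of (9): alpha_k below the diagonal, -alpha_k above it."""
    alpha = [math.sqrt(k * (n + 1 - k)) for k in range(n + 2)]
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        if k >= 1:
            A[k][k - 1] = alpha[k]
        if k < n:
            A[k][k + 1] = -alpha[k + 1]
    return A

def apply(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def numerical_derivative(theta, n, h=1e-6):
    plus, minus = supercircle(theta + h, n), supercircle(theta - h, n)
    return [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]

if __name__ == "__main__":
    n, theta = 5, 0.4
    g = supercircle(theta, n)
    Ag = apply(generator(n), g)
    print(sum(x * x for x in g))                               # squared norm of gamma
    print(max(abs(u - w) for u, w in zip(Ag, numerical_derivative(theta, n))))
    print(math.sqrt(sum(x * x for x in Ag)), math.sqrt(n))     # speed versus sqrt(n)
```

Because the speed does not depend on $\theta$, the zeros are uniformly distributed in $\theta = \arctan(t)$, which is the Cauchy law of Section 3.1.2.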

The ideas here may also be expressed in the language of invariant measures for
polynomials [31]. This gives a deeper understanding of the symmetries that we will
only sketch here. Rather than representing a polynomial as

$$p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_nt^n,$$

we homogenize the polynomial and consider

$$\hat p(t_1, t_2) = a_0t_2^n + a_1t_1t_2^{n-1} + \cdots + a_{n-1}t_1^{n-1}t_2 + a_nt_1^n.$$

For any angle $\alpha$, a new "rotated" polynomial may be defined by

$$\hat p_\alpha(t_1, t_2) = \hat p(t_1\cos\alpha + t_2\sin\alpha,\; -t_1\sin\alpha + t_2\cos\alpha).$$

It is not difficult to show directly that if the $a_i$ are independent and normally
distributed with variance $\binom{n}{i}$, then so are the coefficients of the rotated polynomial.
The symmetry of the curve and the symmetry of the polynomial distribution are
equivalent. An immediate consequence of the rotational invariance is that the
distribution of the real zeros must be Cauchy.
4.2. Geodesics on flat tori. We now take a closer look at the random trigonometric
sums in Section 3.2.4. Fix a finite interval $[a, b]$. For simplicity assume
that

$$\sum_{k=0}^n\sigma_k^2 = 1.$$

The curve $\gamma(\theta)$ is given by

$$(\sigma_0\cos\nu_0\theta,\ \sigma_0\sin\nu_0\theta,\ \ldots,\ \sigma_n\cos\nu_n\theta,\ \sigma_n\sin\nu_n\theta).$$

This curve is a geodesic on the flat $(n+1)$-dimensional torus

$$(\sigma_0\cos\theta_0,\ \sigma_0\sin\theta_0,\ \ldots,\ \sigma_n\cos\theta_n,\ \sigma_n\sin\theta_n).$$

Therefore if we lift to the universal covering space of the torus, $\gamma$ becomes a straight
line in $\mathbb{R}^{n+1}$. By the Pythagorean theorem, the length of $\gamma$ is

$$(b-a)\sqrt{\sum_{k=0}^n\nu_k^2\sigma_k^2},$$

which equals $\pi$ times the expected number of zeros on the interval $[a, b]$.

Now replace $[a, b]$ with $(-\infty, \infty)$. If $\nu_i/\nu_j$ is rational for all $i$ and $j$, then $\gamma$ is
closed; otherwise it is dense in some subtorus.

Now consider the $\gamma(\theta)$ discussed in Section 4.1. Observe that $\gamma(x)^T\gamma(y) =
\cos^n(x-y)$. Thus if we choose the $\nu_k$ and the $\sigma_k$ correctly, the polynomial example
in Section 3.1.2 becomes a special case of a random trigonometric sum. Thus the
super-circle discussed in Section 4.1 is a geodesic on a flat torus.

4.3. The Kac matrix. Mark Kac was the first mathematician to obtain an exact
formula for the expected number of real zeros of a random polynomial. Ironically, he
also has his name attached to a certain matrix that is important to understanding
random polynomials, yet we have no evidence that he ever made the connection.
The $(n+1) \times (n+1)$ Kac matrix is defined as the tridiagonal matrix

$$S_n = \begin{pmatrix} 0 & n & & & & \\ 1 & 0 & n-1 & & & \\ & 2 & 0 & n-2 & & \\ & & \ddots & \ddots & \ddots & \\ & & & n-1 & 0 & 1 \\ & & & & n & 0 \end{pmatrix}.$$

The history of this matrix is documented in [49], where there are several proofs
that $S_n$ has eigenvalues $-n, -n+2, -n+4, \ldots, n-2, n$. One of the proofs is
denoted as "mild trickery by Kac". We will derive the eigenvalues by employing a
different trick.



Theorem 4.1. The eigenvalues of $S_n$ are the integers $2k - n$ for $k = 0, 1, \ldots, n$.

Proof. Define

$$f_k(x) = \sinh^k(x)\cosh^{n-k}(x), \qquad k = 0, \ldots, n,$$

$$g_k(x) = (\sinh(x) + \cosh(x))^k(\sinh(x) - \cosh(x))^{n-k}, \qquad k = 0, \ldots, n.$$

If $V$ is the vector space of functions with basis $\{f_k(x)\}$, then the $g_k(x)$ are clearly
in this vector space. Also, $\frac{d}{dx}f_k(x) = kf_{k-1}(x) + (n-k)f_{k+1}(x)$, so that the Kac
matrix is the representation of the operator $d/dx$ in $V$. We actually wrote $g_k(x)$
in a more complicated way than we needed to so that we could emphasize that
$g_k(x) \in V$. Actually, $g_k(x) = (-1)^{n-k}\exp((2k-n)x)$ is an eigenfunction of $d/dx$ with
eigenvalue $2k - n$ for $k = 0, \ldots, n$. The eigenvector is obtained by expanding the
above expression for $g_k(x)$. □
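Theorem 4.1 can be verified exactly for small $n$ with integer arithmetic. The sketch below evaluates $\det(S_n - \lambda I)$ through the standard three-term recurrence for leading minors of a tridiagonal matrix, using only the products of paired off-diagonal entries, which are $(j-1)(n-j+2)$ for the Kac matrix. The function name is a hypothetical choice made for this illustration.

```python
def kac_char_value(n, lam):
    """det(S_n - lam*I) for the (n+1) x (n+1) Kac matrix, via leading-minor recurrence."""
    prev, cur = 1, -lam                      # minors of size 0 and 1
    for j in range(2, n + 2):
        # Product of the sub- and superdiagonal entries joining rows j-1 and j
        # (1-based) is (j-1) * (n-j+2).
        prev, cur = cur, -lam * cur - (j - 1) * (n - j + 2) * prev
    return cur

if __name__ == "__main__":
    for n in range(1, 7):
        roots = [lam for lam in range(-n - 2, n + 3) if kac_char_value(n, lam) == 0]
        print(n, roots)
```

Because everything stays in exact integers, this avoids the numerical difficulty with the eigenvalues of this matrix that floating-point eigensolvers can run into.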

An alternative tricky proof using graph theory is to consider the $2^n \times 2^n$ incidence
matrix of an $n$-dimensional hypercube. This matrix is the tensor (or Kronecker)
product of $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ $n$ times, so the eigenvalues of this matrix are sums of the form
$\sum_{i=1}^n \pm 1$, i.e., this matrix has $2^n$ eigenvalues all of which have the form $2k - n$
for $k = 0, \ldots, n$. This matrix is closely related to the discrete Laplacian of the
hypercube graph and the $n$-fold discrete Fourier transform on a grid with edge
length 2. So far we have the right set of eigenvalues but the wrong matrix. However,
if we collapse the matrix by identifying those nodes with $k = 0, 1, \ldots, n$ ones in
their binary representation, we obtain the $(n+1) \times (n+1)$ Kac matrix transposed.
(Any node with $k$ ones has $k$ neighbors with $k - 1$ ones and $n - k$ neighbors with
$k + 1$ ones.) It is an interesting exercise to check that by summing eigenvectors
over all possible symmetries of the hypercube, the projected operator inherits the
eigenvalues $2k - n$ ($k = 0, \ldots, n$), each with multiplicity 1. □

We learned of this second tricky proof from Persi Diaconis, who explained it 
to us in terms of random walks on the hypercube and the Ehrcnfcst urn model of 
diffusion [8, 9]. The Kac matrix is also known as the "Clement matrix" in Higham's 
Test matrix toolbox for Matlab [25] because of Clement's [7] proposed use of this 
matrix as a test matrix. Numerically, it can be quite difficult to obtain all of these 
integer eigenvalues. 

The symmetrized Kac matrix looks exactly like the matrix A in (9) without any
minus signs. Indeed $iS_n$ is similar to the matrix in (9).

4.4. The Fubini-Study metric. We now reveal the secret that inspired the
"sneaky" approach to the calculation of the length of the curve $\gamma(t) = v(t)/\|v(t)\|$
that appears in Section 2.3. (See (3).) The secret that we will describe is the
Fubini-Study metric.

An interesting struggle occurs in mathematics when quotient spaces are defined. 
Psychologically, it is often easier to think of an individual representative of an 
equivalence class rather than the class itself. As mathematicians, we train ourselves 
to overcome this; but practically speaking, when it is time to compute, we still must 
choose a representative. As an example, consider a vector $v \in \mathbb{R}^n$ and its projection
$v/\|v\|$ onto the sphere. (If we do not distinguish $\pm v/\|v\|$, we are then in projective
space.) The normalization obtained from the division by $\|v\|$ is a distraction that
we would like to avoid.

Perhaps a more compelling example may be taken from the set of $n \times p$ matrices
M with $n > p$. The Grassmann manifold is obtained by forming the equivalence
class of all rank p matrices M whose columns span the same subspace of $\mathbb{R}^n$. To
compute a canonical form for M may be an unnecessary bother that we would like
to avoid. When p = 1, the Grassmann manifold reduces to the projective space
example in the previous paragraph.

The Fubini-Study metric on projective space allows us to keep the v for our
coordinates in the first example. The more general version for the Grassmann manifold
allows us to keep the M. A historical discussion of Fubini's original ideas may be
found in [35]. We have seen only the complex version in the standard texts [22, 29], 
but for simplicity, we discuss the real case here. 
We see that $\gamma(t)$ is independent of $\|v(t)\|$, i.e., it is invariant under scaling. The
logarithmic derivative is specifically tailored to be invariant under scaling by any
$\lambda(t)$:

$$\frac{\partial^2}{\partial x\,\partial y}\log[\lambda(x)v(x)\cdot\lambda(y)v(y)]
 = \frac{\partial^2}{\partial x\,\partial y}\{\log[v(x)\cdot v(y)] + \log(\lambda(x)) + \log(\lambda(y))\}
 = \frac{\partial^2}{\partial x\,\partial y}\log[v(x)\cdot v(y)].$$

The logarithmic derivative may appear complicated, but it is a fair price to pay to
eliminate $\|v(t)\|$. The length of the projected version of v(t) traced out by $t \in [a, b]$
is

$$\int_a^b \sqrt{\frac{\partial^2}{\partial x\,\partial y}\log[v(x)\cdot v(y)]\,\Big|_{y=x=t}}\;dt.$$

The integrand is the square root of the determinant of the metric tensor. This is
the "pull-back" of a metric tensor on projective space.
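As a numerical illustration of this length formula, here is a small sketch for the moment curve $v(t) = (1, t, \ldots, t^n)$. It estimates the mixed partial of $\log[v(x)\cdot v(y)]$ by a central finite difference and compares it with the closed form $1/(t^2-1)^2 - (n+1)^2 t^{2n}/(t^{2n+2}-1)^2$ obtained by differentiating the geometric sum by hand. The step size `h`, the midpoint rule, and the function names are our own assumptions, chosen only for illustration.

```python
import math

def log_inner(x, y, n):
    """log of v(x).v(y) for the moment curve v(t) = (1, t, ..., t^n)."""
    return math.log(sum((x * y) ** k for k in range(n + 1)))

def speed(t, n, h=1e-4):
    """sqrt of d^2/dxdy log[v(x).v(y)] at x = y = t, estimated with a
    central finite difference (the step h is an illustrative choice)."""
    mixed = (log_inner(t + h, t + h, n) - log_inner(t + h, t - h, n)
             - log_inner(t - h, t + h, n) + log_inner(t - h, t - h, n)) / (4 * h * h)
    return math.sqrt(mixed)

def kac_speed(t, n):
    """Closed form of the same quantity, valid for |t| != 1."""
    return math.sqrt(1 / (t * t - 1) ** 2
                     - (n + 1) ** 2 * t ** (2 * n) / (t ** (2 * n + 2) - 1) ** 2)

def expected_real_zeros(n, steps=2000):
    """Length of the projected curve divided by pi. The symmetries t -> -t
    and t -> 1/t let us integrate over (0, 1) only (midpoint rule)."""
    dt = 1.0 / steps
    return 4 / math.pi * sum(speed((i + 0.5) * dt, n) for i in range(steps)) * dt
```

With these definitions, `expected_real_zeros(1)` is close to 1, as it must be for a random linear polynomial, and the value grows slowly with n.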

The Grassmann version is almost the same; it takes the form

$$\int_a^b \sqrt{\frac{\partial^2}{\partial x\,\partial y}\log\det[M(x)^T M(y)]\,\Big|_{y=x=t}}\;dt,$$

where M(t) denotes a curve in matrix space.



4.5. Integral geometry. Integral geometry (sometimes known as Geometric 
Probability) relates the measures of random manifolds and their intersections. Ref- 
erences (such as [43], [47], [44, p. 253], and [5, p. 73]) vary in terms of setting and 
degree of generality. 

For our purposes we will consider two submanifolds M and N of the sphere
$S^{m+n}$, where M has dimension m and N has dimension n. If Q is a random
orthogonal matrix (i.e., a random rotation), then

$$E(\#(M \cap QN)) = \frac{2}{|S^m|\,|S^n|}\,|M|\,|N|. \tag{11}$$

In words, the formula states that the expected number of intersections of M with
a randomly rotated N is twice the product of the volumes of M and N divided by
the product of the volumes of spheres.
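A simple special case makes formula (11) concrete. Take the sphere $S^2$ (so m = n = 1), let M be an arc of the equator of length L with $0 < L < \pi$, and let N be a great circle, so that $|N| = 2\pi$ and the formula predicts $2 \cdot L \cdot 2\pi / (2\pi \cdot 2\pi) = L/\pi$ expected intersections. The Monte Carlo sketch below is our own illustration; the sample size, seed, and function name are arbitrary choices.

```python
import math
import random

def crossing_rate(arc_length, trials=200_000, seed=0):
    """Fraction of uniformly random great circles on S^2 that cross a fixed
    arc of the equator of the given length (0 < arc_length < pi)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A random great circle is {p : n.p = 0} for a rotation-invariant
        # normal vector n; only its x and y components matter on the equator.
        nx, ny = rng.gauss(0, 1), rng.gauss(0, 1)
        start = nx                                            # n . (1, 0, 0)
        end = nx * math.cos(arc_length) + ny * math.sin(arc_length)
        if start * end < 0:     # a sign change along the arc is one crossing
            hits += 1
    return hits / trials
```

Because the arc is shorter than a half circle, each great circle meets it at most once, so the crossing rate is also the expected number of intersections; `crossing_rate(1.0)` should be close to $1/\pi \approx 0.318$.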

For us "number of intersections" has the interpretation of "number of zeros" 
so that we may relate the average number of zeros with the lengths of curves (or 
more generally volumes of surfaces). We will apply this formula directly when we 
consider random systems of equations in Section 7. 

If the manifold N is itself random and independent of Q, then the formula above
is correct with the understanding that $|N|$ refers to the average volume of N. This
formulation is needed for Lemma 6.1.

The factor of 2 often disappears in practical computations. Mathematically, all 
of the action is on the half-sized projective space rather than on the sphere. 

4.6. The evaluation mapping. The defining property of a function space is that
its elements can be evaluated. To be precise, if F is a vector space of real-valued
functions defined on some set S, we have an evaluation mapping, $\mathrm{ev} : S \to F^*$,
defined by $\mathrm{ev}(s)(f) = f(s)$, that tells us everything about the function space.
Conversely, if we are given any vector space F and any function from S to $F^*$, we
may take this function to be the evaluation mapping and thus convert F into a
function space.

Pick an element f of F (at random). The annihilator of f is the set $f^\perp = \{\theta \in
F^* \mid \theta(f) = 0\}$. Checking definitions, we see that the intersections of $f^\perp$ with the
image of ev correspond to zeros of f. Thus the average number of intersections is
the average number of zeros.

Now let us choose an inner product for F, or equivalently, let us choose a cen- 
tral normal measure for F. If ev(S) is a rectifiable curve, we may apply integral 
geometry and conclude the following: 

Theorem 4.2. The expected number of zeros is the length of the projection of the
image of the evaluation mapping onto the unit sphere in the dual space divided by
$\pi$.

Thus the expected number of zeros is proportional to the "size" of the image of the 
evaluation mapping. 

The inner product also gives rise to an isomorphism $\iota : F \to F^*$, defined by
$\iota(f)(g) = f \cdot g$. It is just a matter of checking definitions to see that

$$v(t) = \iota^{-1}\,\mathrm{ev}(t)$$

is the dual of the evaluation mapping. Thus v(t) is the natural object that describes
both the function space F and the choice of inner product.

5. Extensions to other distributions

This paper began by considering random polynomials with standard normal
coefficients, and then we realized quickly that any multivariate normal distribution
with mean zero (the so-called "central distributions") hardly presented any further
difficulty. We now generalize to arbitrary distributions, with a particular focus on
the non-central multivariate normal distributions. The basic theme is the same:
the density of zeros is equal to the rate at which the equators of a curve sweep
out area. Previous investigations are surveyed in [2]. In the closely related work
of Rice [41] and [42, p. 52], expressions are obtained for the distributions of zeros.
Unfortunately, these expressions appeared unwieldy for computing even the
distribution for the quadratic [41, p. 414]. There is also the interesting recent work of
Odlyzko and Poonen on zeros of polynomials with 0, 1 coefficients [40].

5.1. Arbitrary distributions. Given $f_0(t), f_1(t), \ldots, f_n(t)$, we now ask for the
expected number of real roots of the random equation

$$a_0 f_0(t) + a_1 f_1(t) + \cdots + a_n f_n(t) = 0,$$

where we will assume that the $a_i$ have an arbitrary joint probability density function
$\sigma(a)$.

Define $v(t) \in \mathbb{R}^{n+1}$ by

$$v(t) = (f_0(t), f_1(t), \ldots, f_n(t))^T$$

and let

$$\gamma(t) = v(t)/\|v(t)\|. \tag{12}$$
Instead of working on the sphere, let us work in $\mathbb{R}^{n+1}$ by defining $\gamma(t)^\perp$ to be the
hyperplane through the origin perpendicular to $\gamma(t)$.

Fix t and choose an orthonormal basis such that $e_0 = \gamma(t)$ and $e_1 = \gamma'(t)/\|\gamma'(t)\|$.
As we change t to t + dt, the volume swept out by the hyperplanes will form an
infinitesimal wedge. (See Figure 6.)

This wedge is the Cartesian product of a two-dimensional wedge in the plane
$\mathrm{span}(e_0, e_1)$ with $\mathbb{R}^{n-1}$, the entire span of the remaining $n - 1$ basis directions. The
volume of the wedge is

$$\|\gamma'(t)\|\,dt \int_{\mathbb{R}^n = \{e_0 \cdot a = 0\}} |e_1 \cdot a|\,\sigma(a)\,da^n,$$

where the domain of integration is the n-dimensional space perpendicular to $e_0$ and
$da^n$ denotes n-dimensional Lebesgue measure in that space. Intuitively $\|\gamma'(t)\|\,dt$ is
the rate at which the wedge is being swept out. The width of the wedge is
infinitesimally proportional to $|e_1 \cdot a|$, where a is in this perpendicular hyperspace. The
factor $\sigma(a)$ scales the volume in accordance with our chosen probability measure.

Figure 6. Infinitesimal wedge area.



Theorem 5.1. If a has a joint probability density $\sigma(a)$, then the density of the real
zeros of $a_0 f_0(t) + \cdots + a_n f_n(t) = 0$ is

$$\rho(t) = \|\gamma'(t)\| \int_{\gamma(t)\cdot a = 0} \frac{|\gamma'(t)\cdot a|}{\|\gamma'(t)\|}\,\sigma(a)\,da^n
 = \int_{\gamma(t)\cdot a = 0} |\gamma'(t)\cdot a|\,\sigma(a)\,da^n,$$

where $da^n$ is standard Lebesgue measure in the subspace perpendicular to $\gamma(t)$.

5.2. Non-central multivariate normals: Theory. We apply the results in the 
previous subsection to the case of multivariate normal distributions. We begin by 
assuming that our distribution has mean m and covariance matrix I. We then show
that the restriction on the covariance matrix is readily removed. Thus we assume
that

$$\sigma(a) = (2\pi)^{-(n+1)/2}\,e^{-\sum_i (a_i - m_i)^2/2} \quad\text{and}\quad m = (m_0, \ldots, m_n)^T.$$

Theorem 5.2. Assume that $(a_0, \ldots, a_n)^T$ has the multivariate normal distribution
with mean m and covariance matrix I. Let $\gamma(t)$ be defined as in (12). Let $m_0(t)$
and $m_1(t)$ be the components of m in the $\gamma(t)$ and $\gamma'(t)$ directions, respectively.
The density of the real zeros of the equation $\sum a_i f_i(t) = 0$ is

$$\rho_n(t) = \frac{1}{\pi}\,\|\gamma'(t)\|\,e^{-\frac12 m_0(t)^2}\left\{ e^{-\frac12 m_1(t)^2} + \sqrt{\frac{\pi}{2}}\;m_1(t)\,\mathrm{erf}\!\left[\frac{m_1(t)}{\sqrt2}\right]\right\}.$$

For polynomials with identically distributed normal coefficients, this formula is 
equivalent to [2, Section 4.3C]. 

Proof. Since we are considering the multivariate normal distribution, we may rewrite
$\sigma(a)$ in coordinates $x_0, \ldots, x_n$ in the directions $e_0, \ldots, e_n$ respectively. Thus

$$\sigma(x_0, \ldots, x_n) = (2\pi)^{-(n+1)/2}\,e^{-\frac12\sum_i (x_i - m_i(t))^2},$$

where $m_i(t)$ denotes the coordinate of m in the $e_i$ direction. The n-dimensional
integral formula that appears in Theorem 5.1 reduces to

$$\frac{\|\gamma'(t)\|}{2\pi}\int_{-\infty}^{\infty} |x_1|\,e^{-\frac12 m_0(t)^2}\,e^{-\frac12 (x_1 - m_1(t))^2}\,dx_1$$

after integrating out the $n - 1$ directions orthogonal to the wedge. From this, the
formula in the theorem is obtained by direct integration. □
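The "direct integration" step can be sanity-checked numerically. The sketch below compares the closed form of Theorem 5.2 with a midpoint-rule evaluation of the one-dimensional integral from the proof, including the factor $\|\gamma'(t)\|$ that Theorem 5.1 supplies. The argument names `speed`, `m0`, `m1` stand for $\|\gamma'(t)\|$, $m_0(t)$, $m_1(t)$ at a fixed t; the truncation width and step count are our own choices.

```python
import math

def density_closed_form(speed, m0, m1):
    """Theorem 5.2 density at a fixed t."""
    return (speed / math.pi) * math.exp(-m0 * m0 / 2) * (
        math.exp(-m1 * m1 / 2)
        + math.sqrt(math.pi / 2) * m1 * math.erf(m1 / math.sqrt(2)))

def density_by_quadrature(speed, m0, m1, half_width=12.0, steps=200_000):
    """Midpoint-rule value of
    (speed / 2 pi) * integral of |x| e^{-m0^2/2} e^{-(x - m1)^2/2} dx,
    truncated to [m1 - half_width, m1 + half_width]."""
    lo = m1 - half_width
    dx = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += abs(x) * math.exp(-(x - m1) ** 2 / 2)
    return speed / (2 * math.pi) * math.exp(-m0 * m0 / 2) * total * dx
```

In the central case `m0 = m1 = 0` both functions return `speed / math.pi`, which recovers the mean zero density.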

We can now generalize these formulas to allow for arbitrary covariance matrices 
as we did with Theorem 3.1. We phrase this corollary in a manner that is self- 
contained: no reference to definitions anywhere else in the paper is necessary. 

Corollary 5.1. Let $v(t) = (f_0(t), f_1(t), \ldots, f_n(t))^T$, and let $a = (a_0, \ldots, a_n)$ have
a multivariate normal distribution with mean $m = (m_0, \ldots, m_n)^T$ and covariance
matrix C. Equivalently consider random functions of the form
$\sum a_i f_i(t)$ with mean $\mu(t) = m_0 f_0(t) + \cdots + m_n f_n(t)$ and covariance matrix C.
The expected number of real roots of the equation $\sum a_i f_i(t) = 0$ on the interval
[a, b] is

$$\frac{1}{\pi}\int_a^b \|\gamma'(t)\|\,e^{-\frac12 m_0(t)^2}\left\{ e^{-\frac12 m_1(t)^2} + \sqrt{\frac{\pi}{2}}\;m_1(t)\,\mathrm{erf}\!\left[\frac{m_1(t)}{\sqrt2}\right]\right\} dt,$$

where

$$w(t) = C^{1/2}v(t), \quad \gamma(t) = \frac{w(t)}{\|w(t)\|}, \quad m_0(t) = \frac{\mu(t)}{\|w(t)\|}, \quad\text{and}\quad m_1(t) = \frac{m_0'(t)}{\|\gamma'(t)\|}.$$

Proof. There is no difference between the equation $a \cdot v = 0$ and $C^{-1/2}a \cdot C^{1/2}v = 0$.
The latter equation describes a random equation problem with coefficients
from a multivariate normal with mean $C^{-1/2}m$ and covariance matrix I. Since
$\mu(t)/\|w(t)\| = \gamma(t)\cdot C^{-1/2}m$ and $m_0'(t)/\|\gamma'(t)\| = \gamma'(t)\cdot C^{-1/2}m/\|\gamma'(t)\|$, the result
follows immediately from Theorem 5.2. □

The reader may use this corollary to compute the expected number of roots 
of a random monic polynomial. In this case $m = e_n$ and C is singular, but this
singularity causes no trouble. We now proceed to consider more general examples. 

5.3. Non-central multivariate normals: Applications. We explore two cases 
in which non-central normal distributions have particularly simple zero densities: 

• Case I. $m_0(t) = m$ and $m_1(t) = 0$.

• Case II. $m_0(t) = m_1(t)$.

Case I. $m_0(t) = m$ and $m_1(t) = 0$. If we can arrange for $m_0(t) = m$ to be a constant,
then $m_1(t) = 0$ and the density reduces to

$$\rho(t) = \frac{1}{\pi}\,\|\gamma'(t)\|\,e^{-\frac12 m^2}.$$

In this very special case, the density function for the mean m case is just a constant
factor ($e^{-\frac12 m^2}$) times the mean zero case.

This can be arranged if and only if the function $\|w(t)\|$ is in the linear space
spanned by the $f_i$. The next few examples show when this is possible. In parentheses,
we indicate the subsection of this paper where the reader may find the mean
zero case for comparison.

Example 1 (3.1.2). A random polynomial with a simple answer, even degree: Let
$f_i(t) = t^i$, $i = 0, \ldots, n$, and $C = \mathrm{diag}\left[\binom{n}{i}\right]$, so that $\|w(t)\| = (1 + t^2)^{n/2}$. Choose
$\mu(t) = m(1 + t^2)^{n/2}$, so that $m_0(t) = m$ is a constant.

For example, if n = 2 and $a_0$, $a_1$, and $a_2$ are independent standard Gaussians,
then the random polynomial

$$(a_0 + m) + a_1\sqrt{2}\,t + (a_2 + m)\,t^2$$

is expected to have

$$\sqrt{2}\,e^{-m^2/2}$$

real zeros. The density is

$$\rho(t) = \frac{1}{\pi}\,\frac{\sqrt{2}}{1 + t^2}\,e^{-m^2/2}.$$

Note that as $m \to \infty$, we are looking at perturbations to the equation $t^2 + 1 = 0$
with no real zeros, so we expect the number of real zeros to converge to 0.
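The n = 2 case of this example is easy to test by simulation, since a real quadratic has two real zeros exactly when its discriminant is positive. The following Monte Carlo sketch is our own illustration (sample size, seed, and function name are arbitrary); it estimates the mean number of real zeros of $(a_0 + m) + a_1\sqrt{2}\,t + (a_2 + m)t^2$ for comparison with $\sqrt{2}\,e^{-m^2/2}$.

```python
import math
import random

def mean_real_zeros(m, trials=200_000, seed=0):
    """Monte Carlo estimate of the expected number of real zeros of
    (a0 + m) + a1*sqrt(2)*t + (a2 + m)*t^2 with a0, a1, a2 iid N(0, 1)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a0, a1, a2 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        c, b, a = a0 + m, math.sqrt(2) * a1, a2 + m
        if b * b - 4 * a * c > 0:     # positive discriminant: two real zeros
            total += 2
    return total / trials
```

For example, `mean_real_zeros(0.0)` should be near $\sqrt{2} \approx 1.414$ and `mean_real_zeros(1.0)` near $\sqrt{2}\,e^{-1/2} \approx 0.858$, up to sampling error.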



Example 2 (3.2.4). Trigonometric sums: $\mu(t) = m\sqrt{\sigma_0^2 + \cdots + \sigma_n^2}$.

Example 3 (3.2.2). Random power series: $\mu(t) = m(1 - t^2)^{-1/2}$.

Example 4 (3.2.3). Entire functions: $\mu(t) = m\,e^{t^2/2}$.

Example 5 (3.2.5). Dirichlet series:

$$\mu(t) = m\sqrt{\zeta(2t)} = m\sum_{k=1}^{\infty}\frac{m_k}{k^t},$$

where $m_k = 0$ if k is not a square, and $m_k = \prod_i \frac{(2n_i - 1)!!}{(2n_i)!!}$ if k has the prime
factorization $\prod_i p_i^{2n_i}$.

Case II. $m_0(t) = m_1(t)$. We may pick a $\mu(t)$ for which $m_0(t) = m_1(t)$ by solving
the first-order ordinary differential equation $m_0(t) = m_0'(t)/\|\gamma'(t)\|$. The solution
is

$$\mu(t) = m\,\|w(t)\|\,\exp\left[\int_K^t \|\gamma'(x)\|\,dx\right].$$

There is really only one integration constant since the result of shifting by K can
be absorbed into the m factor. If the resulting $\mu(t)$ is in the linear space spanned
by the $f_i$, then we choose this as our mean.
Though there is no reason to expect this, it turns out that if we make this choice
of $\mu(t)$, then the density may be integrated in closed form. The expected number
of zeros on the interval [a, b] is

$$\int_a^b \rho(t)\,dt = \left[\frac14\,\mathrm{erf}^2\!\big(m_0(t)/\sqrt2\big) - \frac{1}{2\pi}\,\Gamma[0, m_0^2(t)]\right]_a^b.$$
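For readers who wish to see why a closed form exists, here is a short check, written by us as a sketch rather than taken from the original derivation. It uses the density of Theorem 5.2 with $m_1(t) = m_0(t)$, abbreviates $s = m_0(t)$, and assumes $s > 0$. The antiderivative $\frac14\mathrm{erf}^2(s/\sqrt2) - \frac{1}{2\pi}\Gamma(0, s^2)$ is then verified by differentiation.

```latex
% Case II: m_1(t) = m_0(t) = s, so s' = s \|\gamma'(t)\|, i.e. \|\gamma'(t)\|\,dt = ds/s.
\begin{align*}
\rho(t)\,dt
  &= \frac{1}{\pi}\,\frac{ds}{s}\,e^{-s^2/2}
     \left\{ e^{-s^2/2} + \sqrt{\tfrac{\pi}{2}}\; s\,
       \operatorname{erf}\!\left(\tfrac{s}{\sqrt 2}\right) \right\} \\
  &= \frac{e^{-s^2}}{\pi s}\,ds
     + \frac{1}{\sqrt{2\pi}}\,e^{-s^2/2}\,
       \operatorname{erf}\!\left(\tfrac{s}{\sqrt 2}\right) ds, \\[4pt]
\frac{d}{ds}\left[\tfrac14 \operatorname{erf}^2\!\left(\tfrac{s}{\sqrt 2}\right)\right]
  &= \frac{1}{\sqrt{2\pi}}\,e^{-s^2/2}\,
     \operatorname{erf}\!\left(\tfrac{s}{\sqrt 2}\right), \\
\frac{d}{ds}\left[-\tfrac{1}{2\pi}\,\Gamma(0, s^2)\right]
  &= \frac{e^{-s^2}}{\pi s},
  \qquad \Gamma(0, x) = \int_x^\infty \frac{e^{-u}}{u}\,du .
\end{align*}
```

The two derivatives add up to the two terms of $\rho(t)\,dt/ds$, so integrating from t = a to t = b gives the difference of the bracketed expression at $m_0(b)$ and $m_0(a)$.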

Example 6 (3.2.2). Random power series: Consider a power series with independent,
identically distributed normal coefficients. In this case $\mu(t) = \frac{m}{1 - t}$, where
m = (mean/standard deviation), so $m_0(t) = m\sqrt{\frac{1 + t}{1 - t}}$. A short calculation shows
that $m_1(t) = m_0(t)$.

Example 7 (3.2.3). Entire functions: In this case $\mu(t) = m\,e^{t + t^2/2}$, so
$m_0(t) = m\,e^{t + t^2/2}/e^{t^2/2} = m\,e^t$.

Example 8 (3.2.5). Dirichlet series: This we leave as an exercise. Choose K > 
1/2. 

Theorem 5.3. Consider a random polynomial of degree n with coefficients that
are independent and identically distributed normal random variables. Define $m \neq 0$
to be the mean divided by the standard deviation. Then as $n \to \infty$,

$$E_n = \frac{1}{\pi}\log(n) + \frac{C_1}{2} + \frac12 - \frac{\gamma}{\pi} - \frac{2}{\pi}\log|m| + O(1/n),$$

where $C_1 = 0.6257358072\ldots$ is defined in Theorem 2.2 and $\gamma = 0.5772156649\ldots$ is
Euler's constant. Furthermore, the expected number of positive zeros is asymptotic
to

$$\frac12 - \frac12\,\mathrm{erf}^2\!\big(|m|/\sqrt2\big) + \frac{1}{\pi}\,\Gamma[0, m^2].$$

Sketch of proof. We break up the domain of integration into four subdomains:
$(-\infty, -1]$, $[-1, 0]$, $[0, 1]$, and $[1, \infty)$. Observe that the expected number of zeros
on the first and second intervals is the same, as is the expected number of
zeros on the third and fourth intervals. Thus we will focus on the first and third
interval, doubling our final answer.

The asymptotics of the density of zeros is easy to analyze on [0, 1] because it
converges quickly to that of the power series (Example 6, above). Doubling this
gives us the expected number of positive zeros.

On the interval $(-\infty, -1]$, one can parallel the proof of Theorem 2.2. We make
the change of variables $-t = 1 + x/n$. The weight due to the non-zero mean can
be shown to be $1 + O(1/n)$. Therefore, the asymptotic series for the density of the
zeros is the same up to $O(1/n)$. We subtract the asymptotic series for the density
of the zeros of the non-central random power series and then integrate term by
term. □

The $\frac{1}{\pi}\log(n)$ term was first derived by Sambandham. Farahmand [17] has
improved on his results.

6. Eigenvalues of random matrices 

Eigenvalues of random matrices arise in a surprising number of disciplines of 
both pure and applied mathematics. Already three major books [19], [37], [38] on 
the subject exist, each specializing in different disciplines, yet these books serve as 
mere stepping-stones to the vast literature on the subject. The book by Mehta [37] 
covers developments of random matrices (mostly symmetric) that began with the
work of Wigner, who modeled heavy atom energies with random matrix eigenvalues. 
Muirhead's book [38] focuses on applications to multivariate statistics, including 
eigenvalue distributions of Wishart matrices. These are equivalent to singular value 
distributions of rectangular matrices whose columns are iid multivariate normal. 
His exposition is easily read with almost no background. Girko's large book [19] 
translates his earlier books from Russian and includes more recent work as well. 

An entire semester's interdisciplinary graduate course [12] was inadequate for 
reviewing the subject of eigenvalues of random matrices. Some exciting recent 
developments may be found in books by Voiculescu, Dykema, and Nica [50] relating 
Wigner's theory to free random variables and by Faraut and Koranyi [18], which 
extend the special functions of matrix argument described in [38] from the harmonic 
analysis viewpoint. Other new areas that we wish to mention quickly concern 
matrix models for quantum gravity [1] , Lyapunov exponents [39] , and combinatorial 
interpretations of random matrix formulas [20], [24]. By no means should the 
handful of papers mentioned be thought of as an exhaustive list. 

Developers of numerical algorithms often use random matrices as test matrices 
for their software. An important lesson is that a random matrix should not be 
equated to the intuitive notion of a "typical" matrix or the vague concept of "any 
old" matrix. Random matrices, particularly large ones, have special properties of 
their own. Often there is little more information obtained from 1,000 random trials 
than from one trial [14]. 



6.1. How many eigenvalues of a random matrix are real? Assume that we
have a random matrix with independent standard normal entries. If n is even, the
expected number of real eigenvalues is

$$E_n = \sqrt{2}\sum_{k=0}^{n/2-1}\frac{(4k-1)!!}{(4k)!!},$$

while if n is odd,

$$E_n = 1 + \sqrt{2}\sum_{k=1}^{(n-1)/2}\frac{(4k-3)!!}{(4k-2)!!}.$$

As $n \to \infty$,

$$E_n \sim \sqrt{2n/\pi}.$$

This is derived in [13] using zonal polynomials. The random eigenvalues form an
interesting Saturn-like picture in the complex plane. Figure 7 plots normalized
eigenvalues $\lambda/\sqrt{50}$ in the complex plane for fifty matrices of size $50 \times 50$. There
are 2,500 dots in the figure. Girko's [19] circular law (which we have not verified)
states under general conditions that as $n \to \infty$, $\lambda/\sqrt{n}$ is uniformly distributed on
the disk. If the entries are independent standard normals, a proof may be found in
[15], where also may be found a derivation of the repulsion from the real axis that
is clearly visible.
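To get a feeling for how quickly the asymptotic estimate $\sqrt{2n/\pi}$ takes hold, one can evaluate finite sums for the expected number of real eigenvalues. The sketch below assumes the closed forms $E_n = \sqrt2\sum_{k=0}^{n/2-1}(4k-1)!!/(4k)!!$ for even n and $E_n = 1 + \sqrt2\sum_{k=1}^{(n-1)/2}(4k-3)!!/(4k-2)!!$ for odd n, with the convention $(-1)!! = 1$, as we recall them from [13]; the reader should treat these formulas, and the function name, as assumptions of this illustration.

```python
import math

def expected_real_eigenvalues(n):
    """E_n for an n x n matrix with iid standard normal entries, using the
    double-factorial sums assumed in the text above."""
    if n % 2 == 0:
        term, total = 1.0, 0.0      # term = (4k-1)!!/(4k)!!, which is 1 at k = 0
        for k in range(n // 2):
            if k > 0:
                term *= (4 * k - 3) * (4 * k - 1) / ((4 * k - 2) * (4 * k))
            total += term
        return math.sqrt(2) * total
    term, total = 0.5, 0.0          # term = (4k-3)!!/(4k-2)!!, which is 1/2 at k = 1
    for k in range(1, (n - 1) // 2 + 1):
        if k > 1:
            term *= (4 * k - 5) * (4 * k - 3) / ((4 * k - 4) * (4 * k - 2))
        total += term
    return 1 + math.sqrt(2) * total
```

The ratio `expected_real_eigenvalues(n) / math.sqrt(2 * n / math.pi)` stays slightly above 1 and drifts toward it as n grows.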

Girko's circular law stands in contrast to the result that roots of random poly- 
nomials are uniformly distributed on the unit circle rather than the disk. 
Figure 7. 2,500 dots representing normalized eigenvalues of fifty
random matrices of size n = 50. Clearly visible are the points on
the real axis.



6.2. Matrix polynomials. This may come as a shock to some readers, but char- 
acteristic polynomials are a somewhat unnatural way to discuss the eigenvalues of 
a matrix. It seems irrelevant that a random matrix happens to have a random 
characteristic polynomial, so we will not discuss random characteristic polynomi- 
als any further. An analogous situation occurs in the numerical computation of 
eigenvalues, where nobody would dream of forming the characteristic polynomial. 

The proper generalization that includes polynomials and matrices as special cases
is the so-called matrix polynomial. A matrix polynomial has the form

$$P(t) = A_0 + A_1 t + A_2 t^2 + \cdots + A_n t^n,$$

where the $A_i$ are $p \times p$ matrices and t is a scalar. The solutions to $\det P(t) = 0$
are the eigenvalues of the matrix polynomial. Notice that we are not trying to set
P(t) to be the zero matrix, but rather we are trying to find a t for which P(t)
is a singular matrix. It is sometimes convenient to take $A_n = I$. The standard
eigenvalue problem takes n = 1 and $A_1 = I$. When n = 1 and $A_1 \neq I$, the problem
is known as the generalized eigenvalue problem. Pure polynomials correspond to
p = 1.

The beauty of random matrix polynomials is that the expected number of real
eigenvalues depends on p by a geometric factor:
Theorem 6.1. Let $f_0(t), \ldots, f_n(t)$ be any collection of differentiable functions,
and let $A_0, \ldots, A_n$ be $p \times p$ random matrices with the property that the $p^2$
random vectors $((A_0)_{ij}, (A_1)_{ij}, \ldots, (A_n)_{ij})$ ($i, j = 1, \ldots, p$) are iid multivariate
normals with mean zero and covariance matrix C. Let $\alpha_p$ denote the expected
number of real solutions in the interval [a, b] to the equation

$$0 = \det[A_0 f_0(t) + A_1 f_1(t) + \cdots + A_n f_n(t)].$$

We then have that

$$\frac{\alpha_p}{\alpha_1} = \sqrt{\pi}\,\frac{\Gamma((p+1)/2)}{\Gamma(p/2)}.$$

$\alpha_1$ may be computed from Theorem 3.1.



In particular, if all of the matrices are independent standard normals, the
expected number of real solutions is

$$E_n \times \sqrt{\pi}\,\frac{\Gamma((p+1)/2)}{\Gamma(p/2)},$$

where $E_n$ is the quantity that appears in Theorem 2.1. The proof of Theorem 6.1
follows from a simple consequence of the integral geometry formula.
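The geometric factor $\sqrt{\pi}\,\Gamma((p+1)/2)/\Gamma(p/2)$ is easy to tabulate. The sketch below is our own illustration (the function name is hypothetical); it uses log-gamma values so that large p does not overflow.

```python
import math

def real_eigenvalue_factor(p):
    """sqrt(pi) * Gamma((p+1)/2) / Gamma(p/2), i.e. the ratio alpha_p / alpha_1
    of Theorem 6.1, computed through log-gamma for numerical stability."""
    return math.sqrt(math.pi) * math.exp(math.lgamma((p + 1) / 2) - math.lgamma(p / 2))
```

The first values are 1 for p = 1, $\pi/2$ for p = 2, and 2 for p = 3, and for large p the factor grows like $\sqrt{\pi p/2}$.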

Lemma 6.1. Choose an interval [a, b] and a random function

$$a_0 f_0(t) + a_1 f_1(t) + \cdots + a_n f_n(t), \qquad t \in [a, b],$$

where the $a_i$ are independent standard normals. Generate a random curve in $\mathbb{R}^k$
by choosing an independent sample of k such functions. The expected length of the
projection of this curve onto the unit sphere in $\mathbb{R}^k$ is equal to the expected number
of zeros of the chosen random function on the chosen interval, multiplied by $\pi$.

Proof. The lemma follows from (11). Let N be the random curve. Since the 
distribution of QN is the same as that of N, we may take M to be any fixed 
hyperplane, say $x_1 = 0$. The intersections of a curve with this hyperplane are
exactly the zeros of the first coordinate of the curve and thus the zeros of our 
random function. □ 

Notice that the expected length does not depend on k. This result generalizes 
to random embeddings of manifolds in Euclidean space. See [33] for a discussion of 
these and other random varieties. 

Proof of Theorem 6.1. We prove this theorem by using integral geometry twice to
obtain expressions for the average length of the random curve $\gamma$ defined by

$$A(t) = A_0 f_0(t) + A_1 f_1(t) + \cdots + A_n f_n(t)$$

on some interval [a, b], and $\gamma(t) = A(t)/\|A(t)\|_F$.

On the one hand, Lemma 6.1 states that the expected length of the projection
$\gamma(t)$ is $\alpha_1\pi$.

On the other hand, (11) may be used with M chosen to be the set of singular
matrices on $S^{p^2-1}$, and N is the random curve $\gamma$. Thus the expected number of t
for which $\gamma(t)$ is singular is

$$\frac{1}{\pi\,|S^{p^2-2}|}\,|M|\,|N|. \tag{13}$$

The volume of M is known [13] to be

$$|M| = \frac{2\pi^{p^2/2}\,\Gamma((p+1)/2)}{\Gamma(p/2)\,\Gamma((p^2-1)/2)}.$$

The average length of N is $\pi\alpha_1$. The volume of $S^{p^2-2}$ is $2\pi^{(p^2-1)/2}/\Gamma((p^2-1)/2)$.
Plugging these volumes back into (13) yields the result. □

7. Systems of equations 

The results that we have derived about random equations in one variable may
be generalized to systems of m equations in m unknowns. What used to be a curve
$v(t) : \mathbb{R} \to \mathbb{R}^{n+1}$ is now an m-dimensional surface $v(t) : \mathbb{R}^m \to \mathbb{R}^{n+1}$ defined in the
same way. The random coefficients now form an $m \times (n + 1)$ matrix A.



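Theorem 7.1. Let $v(t) = (f_0(t), \ldots, f_n(t))^T$ be any differentiable function from $\mathbb{R}^m$ to
$\mathbb{R}^{n+1}$, let U be a measurable subset of $\mathbb{R}^m$, and let A be a random $m \times (n + 1)$
matrix. Assume that the rows of A are iid multivariate normal random vectors
with mean zero and covariance matrix C. The expected number of real roots of
the system of equations

$$Av(t) = 0$$

that lie in the set U is

$$\pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)\int_U \left(\det\left[\frac{\partial^2}{\partial x_i\,\partial y_j}\log\big(v(x)^T C\,v(y)\big)\Big|_{y=x=t}\right]_{ij}\right)^{1/2} dt.$$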

Proof. This is an application of the integral geometry formula (11). To apply this
formula on the unit sphere $S^n \subset \mathbb{R}^{n+1}$, we choose a submanifold M of dimension
m and a submanifold N of dimension $n - m$.

For simplicity assume first that C = I. We take M to be the projection of
$\{v(t) : t \in U\}$ to the unit sphere. For N we take the intersection of a plane of
dimension $n - m + 1$ with the sphere, i.e., $N = S^{n-m} \subset S^n$.

According to (11), if we intersect M with a random $(n - m + 1)$-dimensional
plane, the expected number of intersections is

$$E(\#(M \cap QN)) = \frac{2}{|S^m|\,|S^{n-m}|}\,|M|\,|N| = \frac{2|M|}{|S^m|} = \pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)|M|.$$

The Fubini-Study metric conveniently tells us that $|M|$ is the integral in the
statement of the theorem.

Of course, the number of real roots of $Av(t) = 0$ is the number of intersections
of M with the null-space of A (counting multiplicity). Since for the moment we
assume that C = I, the random null-space of A is invariant under rotations, proving
that the average number of intersections is the average number of real roots.

For arbitrary C the entire derivation applies by replacing A with $AC^{-1/2}$. □

We now extend our previous examples to random systems of equations. 

7.1. The Kac formula. Consider systems of polynomial equations with independent
standard normal coefficients. The most straightforward generalization occurs
if the components of v are all the monomials $\{\prod_{k=1}^m x_k^{i_k}\}$, where for all k, $i_k \le d$.
In other words, the Newton polyhedron is a hypercube.

Clearly,

$$v(x)^T v(y) = \prod_{i=1}^{m}\sum_{k=0}^{d}(x_i y_i)^k,$$

from which we see that the matrix in the formula above is diagonal and that the
density of the zeros on $\mathbb{R}^m$ breaks up as a product of densities on $\mathbb{R}$. Thus if $E_d^{(m)}$
represents the expected number of zeros for the system,

$$E_d^{(m)} = \pi^{\frac{m-1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)\big(E_d^{(1)}\big)^m.$$

The asymptotics of the univariate Kac formula shows that as $d \to \infty$,

$$E_d^{(m)} \sim \pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)(2\log d)^m.$$

We suspect that the same asymptotic formula holds for a wide range of Newton
polyhedra, including the usual definition of degree: $\sum_{k=1}^m i_k \le d$ [32].

7.2. A random polynomial with a simple answer. Consider a system of m
random polynomials, each of the form

$$\sum_{i_1, \ldots, i_m} a_{i_1 \ldots i_m}\prod_{k=1}^m x_k^{i_k},$$

where $\sum_{k=1}^m i_k \le d$ and where the $a_{i_1 \ldots i_m}$ are independent normals with mean zero
and variances equal to multinomial coefficients:

$$\binom{d}{i_1, \ldots, i_m} = \frac{d!}{\left(d - \sum_{k=1}^m i_k\right)!\,\prod_{k=1}^m i_k!}.$$

The multinomial theorem simplifies the computation of

$$v(x)^T C\,v(y) = (1 + x \cdot y)^d.$$

We see that the density of zeros is

$$\rho(t) = \pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)\frac{d^{m/2}}{(1 + t \cdot t)^{(m+1)/2}}.$$

In other words, the zeros are uniformly distributed on real projective space, and
the expected number of zeros is $d^{m/2}$.

Shub and Smale [46] have generalized this result as follows. Consider m
independent equations of degrees $d_1, \ldots, d_m$, each defined as in this example. Then the
expected number of real zeros of the system is

$$\sqrt{\prod_{k=1}^m d_k}.$$

The result has also been generalized to underdetermined systems of equations [31].
That is to say, we may consider the expected volume of a random real projective
variety. The degrees of the equations need not be the same. The key result is as
follows. The expected volume of a real projective variety is the square root of the
product of the degrees of the equations defining the variety, multiplied by the volume
of the real projective space of the same dimension as the variety. For a detailed
discussion of random real projective varieties, see [33].
7.3. Random harmonic polynomials. Consider the vector space of homogeneous
polynomials of degree d in m + 1 variables that are harmonic, that is, the
Laplacians of the polynomials are equal to zero. If Q is an orthogonal $(m+1) \times (m+1)$
matrix, then the map that sends p(x) to p(Qx) is a linear map from our vector
space to itself, i.e., we have a representation of the orthogonal group $O(m + 1)$. It is
a classical result in Lie group theory that there is, up to a constant, a unique normal
measure on harmonic polynomials that is invariant under orthogonal rotations of
the argument. It follows that this representation is irreducible.

We outline a proof by considering the invariance of $v(x)^T C v(y)$. Assume that
for any orthogonal matrix Q, $v(Qx)^T C v(Qy) = v(x)^T C v(y)$. This implies that
$v(x)^T C v(y)$ must be a polynomial in $x \cdot x$, $x \cdot y$, and $y \cdot y$. This is classical invariant
theory. For proofs and discussion of such results, see [48, Vol. 5, pp. 466-486]. We
thus deduce that $v(x)^T C v(y)$ must be of the form

$$\sum_{k=0}^{[d/2]} \beta_k\,(x \cdot x)^k (y \cdot y)^k (x \cdot y)^{d-2k}.$$

Setting the Laplacian of this expression to zero, we see that

$$2k(m + 2d - 2k - 1)\,\beta_k + (d - 2k + 2)(d - 2k + 1)\,\beta_{k-1} = 0$$

and therefore that

$$\frac{\beta_k}{\beta_0} = \frac{(-1)^k\,d!\,(m + 2d - 2k - 3)!!}{2^k\,k!\,(d - 2k)!\,(m + 2d - 3)!!}.$$

Thus we see that $v(x)^T C v(y)$ is uniquely determined (up to a constant).
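The passage from the recurrence to the closed form for $\beta_k/\beta_0$ can be verified with exact rational arithmetic. The sketch below is our own check, with hypothetical function names; it assumes the recurrence $2k(m + 2d - 2k - 1)\beta_k + (d - 2k + 2)(d - 2k + 1)\beta_{k-1} = 0$ and the convention $n!! = 1$ for $n \le 0$.

```python
from fractions import Fraction
from math import factorial

def double_factorial(n):
    """n!! with the convention n!! = 1 for n <= 0."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def beta_ratio(k, d, m):
    """Closed form for beta_k / beta_0."""
    return Fraction(
        (-1) ** k * factorial(d) * double_factorial(m + 2 * d - 2 * k - 3),
        2 ** k * factorial(k) * factorial(d - 2 * k) * double_factorial(m + 2 * d - 3))

def recurrence_holds(d, m):
    """Check the two-term recurrence for every k = 1, ..., floor(d/2)."""
    return all(
        2 * k * (m + 2 * d - 2 * k - 1) * beta_ratio(k, d, m)
        + (d - 2 * k + 2) * (d - 2 * k + 1) * beta_ratio(k - 1, d, m) == 0
        for k in range(1, d // 2 + 1))
```

For example, with d = 2 and m = 1 the closed form gives $\beta_1/\beta_0 = -1/2$, which corresponds to the harmonic kernel $(x \cdot y)^2 - \frac12 (x \cdot x)(y \cdot y)$ in two variables.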

From this formula we can show that the expected number of roots for a system
of m such random harmonic polynomial equations is

$$\left(\frac{d(d + m - 1)}{m}\right)^{m/2}.$$

Because of the orthogonal invariance of these random polynomials, results hold
in the generality of the polynomials in Section 7.2. Thus we may consider systems of
harmonic polynomials of different degrees, or we may consider
underdetermined systems, and the obvious generalizations of the above result will
hold. See [32] for a detailed discussion.

7.4. Random power series. For a power series in m variables with independent
standard normal coefficients, we see that the density of zeros on $\mathbb{R}^m$ breaks up as
the product of m densities:

$$\rho(t) = \pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)\prod_{k=1}^m \frac{1}{1 - t_k^2}.$$

Notice that the power series converges with probability one on the unit hypercube,
and that at the boundaries of this domain the density of zeros becomes infinite.

7.5. Random entire functions. Consider a random power series

$$f(x) = \sum_{i_1, \ldots, i_m} a_{i_1 \ldots i_m}\prod_{k=1}^m x_k^{i_k},$$

where the $a_{i_1 \ldots i_m}$ are independent normals with mean zero and variance
$\left(\prod_{k=1}^m i_k!\right)^{-1}$. Clearly

$$v(x)^T C\,v(y) = e^{x \cdot y},$$

so the zeros are uniformly distributed on $\mathbb{R}^m$ with

$$\pi^{-\frac{m+1}{2}}\,\Gamma\!\left(\frac{m+1}{2}\right)$$

zeros per unit volume.



8. Complex zeros 



We now present the complex version of Theorem 7.1 and discuss some conse- 
quences. We define a complex (multivariate) normal vector to be a random vector 
for which the real and imaginary parts are independent identically distributed (mul- 
tivariate) normal vectors. 



Theorem 8.1. Let $v(z) = (f_0(z), \ldots, f_n(z))^T$ be any complex analytic function
from an open subset of $\mathbb{C}^m$ to $\mathbb{C}^{n+1}$, let U be a measurable subset of $\mathbb{C}^m$,
and let A be a random $m \times (n + 1)$ matrix. Assume that the rows of A are iid
complex multivariate normal vectors with mean zero and covariance matrix C.
The expected number of roots of the system of equations

$$Av(z) = 0$$

that lie in the set U is

$$\frac{m!}{\pi^m}\int_U \det\left[\frac{\partial^2}{\partial z_i\,\partial \bar z_j}\log\big(v(z)^T C\,\overline{v(z)}\big)\right]\prod_i dx_i\,dy_i.$$

Sketch of proof. The proof is analogous to that of Theorem 7.1 but uses complex 
integral geometry [43, p. 342]. The volume of the projection of v(z) is calculated 
using the complex Fubini-Study metric [22, pp. 30-31]. □ 



If $U$ is Zariski open, then by Bertini's theorem, the number of intersections is
constant almost everywhere. This number is called the degree of the embedding (or
of the complete linear system of divisors, if we wish to emphasize the intersections).
From what we have seen, the volume of the embedding is this degree multiplied by
the volume of complex projective space of dimension $m$. For example, the volume
of the Veronese surface $v : P(\mathbb{C}^3) \to P(\mathbb{C}^6)$, defined by

$$v(x, y, z) = (x^2, y^2, z^2, xy, xz, yz),$$

is $4 \times \pi^2/2!$. This corresponds to the fact that pairs of plane conics intersect at four
points.
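The arithmetic behind this example can be spelled out as follows, assuming the normalization in which complex projective space of dimension $m$ has volume $\pi^m/m!$ (the reciprocal of the factor appearing in Theorem 8.1) and using the Bézout count $2 \cdot 2 = 4$ for two conics:

```latex
\deg v = 2^2 = 4, \qquad
\operatorname{vol} P(\mathbb{C}^3) = \frac{\pi^2}{2!}, \qquad
\operatorname{vol} v\bigl(P(\mathbb{C}^3)\bigr) = 4 \cdot \frac{\pi^2}{2!} = 2\pi^2 .
```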

For the univariate case, if the coefficients are complex independent standard 
normals, the zeros concentrate on the unit circle (not the disk!) as the degree 
grows. 

For the complex version of the polynomial considered in Sections 3.1.2 and 7.2,
the zeros are uniformly distributed on complex projective space. Just as was observed
for the real version of this example in Section 4.1, this uniformity is a consequence
of (unitary) invariance of the homogeneous version of these random polynomials.
But for the complex case more can be said: these polynomials provide
the unique normal measure (up to a constant) on the space of polynomials that is
unitarily invariant. A simple proof and discussion of this may be found in [30].
8.1. Growth rates of analytic functions. Complex analysts know that there is
a connection between the asymptotic growth of analytic functions and the number
of zeros inside disks of large radius. Functions whose growth may be modeled by
the function $\exp(\tau z^{\rho})$ are said to have order $\rho$ and type $\tau$. Precise definitions may
be found in [6, p. 8]. Let $n(r)$ be the number of zeros of $f(z)$ in the disk $|z| < r$.
If $f(z)$ has at least one zero anywhere on the complex plane, then [6, Eq. (2.5.19)]

$$\limsup_{r \to \infty} \frac{\log n(r)}{\log r} \le \rho. \tag{14}$$

It is possible [6, (2.2.2) and (2.2.9)] to compute the order and type from the
Taylor coefficients of $f(z) = a_0 + a_1 z + a_2 z^2 + \cdots$, by using

$$\rho = \limsup_{n \to \infty} \frac{-n \log n}{\log |a_n|} \tag{15}$$

and

$$\tau = \frac{1}{e\rho} \limsup_{n \to \infty} n\,|a_n|^{\rho/n}. \tag{16}$$
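To make (15) and (16) concrete, the following Python sketch evaluates the individual terms of the two limsup expressions for the illustrative choice $f(z) = e^z$, whose Taylor coefficients are $a_n = 1/n!$ and whose order and type are both 1. A single large-$n$ term is only a heuristic stand-in for a limsup, and the function names are hypothetical.

```python
import math

def order_term(log_abs_a, n):
    """Term n of (15): -n log n / log|a_n|."""
    return -n * math.log(n) / log_abs_a(n)

def type_term(log_abs_a, n, rho):
    """Term n of (16): n |a_n|^(rho/n) / (e rho)."""
    return n * math.exp(rho * log_abs_a(n) / n) / (math.e * rho)

def log_abs_a_exp(n):
    """log|a_n| for exp(z): a_n = 1/n!, so log|a_n| = -lgamma(n + 1)."""
    return -math.lgamma(n + 1)

# Illustrative run; working with log|a_n| avoids underflow of 1/n!.
for n in (10**2, 10**4, 10**6):
    print(n, order_term(log_abs_a_exp, n), type_term(log_abs_a_exp, n, 1.0))
```

The terms of (15) approach 1 only logarithmically in $n$, while those of (16) settle much faster, so in practice the order estimate needs far larger $n$ than the type estimate.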

We now illustrate these concepts with random power series. We shall restrict to 
the univariate case, and we shall assume that the coefficients are independent. 

Theorem 8.2. Let

$$f(z) = a_0 + a_1 z + a_2 z^2 + \cdots$$

be a random power series (or polynomial), where the $a_i$ are independent complex
normals with mean zero and variances $\sigma_i^2 \ge 0$. Let

$$\phi(z) = \sigma_0^2 + \sigma_1^2 z + \sigma_2^2 z^2 + \cdots$$

be the generating function of the variances, and assume that $\phi(z)$ has a non-zero
radius of convergence. Let $n(r)$ be the expected number of zeros of the random
function $f(z)$ in the disk $|z| < r$. Then

$$n(r) = \frac{r}{2}\,\frac{d}{dr} \log \phi(r^2).$$

Proof. Observe that $v(z)^T C\,\overline{v(z)} = \phi(z \bar z) = \phi(r^2)$, where $r = |z|$ and $v(z)$ is the
(infinite-dimensional) moment curve. Thus it is easy to check that

$$\frac{\partial^2}{\partial z\,\partial \bar z} \log v(z)^T C\,\overline{v(z)} = \frac{1}{4r}\,\frac{d}{dr}\,r\,\frac{d}{dr} \log \phi(r^2).$$

This is multiplied by $r\,dr\,d\theta/\pi$ and then integrated over the disk $|z| < r$. □

This theorem, together with the fact that the distribution of zeros is radially
symmetric, completely describes the distribution of zeros for these random functions.
In fact, n(r) is exactly the unnormalized cumulative distribution function
for the absolute values of the zeros.
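For a polynomial with finitely many variances $\sigma_k^2$, the chain rule turns the formula of Theorem 8.2 into $n(r) = r^2 \phi'(r^2)/\phi(r^2) = \sum_k k\,\sigma_k^2 r^{2k} / \sum_k \sigma_k^2 r^{2k}$. The Python sketch below evaluates this ratio; the function name and the two variance lists (all variances equal to 1, and binomial variances $\binom{n}{k}$) are illustrative assumptions.

```python
import math

def expected_zeros_in_disk(variances, r):
    """n(r) = sum(k s_k r^(2k)) / sum(s_k r^(2k)), with s_k the variances."""
    t = r * r
    num = sum(k * s * t ** k for k, s in enumerate(variances))
    den = sum(s * t ** k for k, s in enumerate(variances))
    return num / den

n = 20
equal = [1.0] * (n + 1)                                  # sigma_k^2 = 1
binom = [float(math.comb(n, k)) for k in range(n + 1)]   # sigma_k^2 = C(n, k)

# Expected number of zeros inside disks of a few radii.
for r in (0.8, 1.0, 1.25, 2.0):
    print(r, expected_zeros_in_disk(equal, r), expected_zeros_in_disk(binom, r))
```

With equal variances the symmetry $k \mapsto n - k$ gives $n(r) + n(1/r) = n$, so exactly half of the expected zeros lie in the unit disk. With binomial variances, $\phi(t) = (1+t)^n$ and the ratio simplifies to $n r^2/(1 + r^2)$.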

As a simple example, let

$$\phi(z) = e^{2\tau z^{\rho/2}}.$$

By applying the Borel-Cantelli Lemma [45, p. 253] to (15) and (16), we see that the
random function $f(z)$ has order $\rho$ and type $\tau$ with probability one. The theorem
we have just established then gives

$$n(r) = \tau\rho\, r^{\rho}.$$

This result is reasonable in light of (14). 
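The computation behind this example can be written out in one line, assuming the generating function satisfies $\phi(r^2) = \exp(2\tau r^{\rho})$, with $\rho$ the order and $\tau$ the type:

```latex
n(r) = \frac{r}{2}\,\frac{d}{dr}\log\phi(r^2)
     = \frac{r}{2}\,\frac{d}{dr}\bigl(2\tau r^{\rho}\bigr)
     = \tau\rho\, r^{\rho},
\qquad
\frac{\log n(r)}{\log r} = \rho + \frac{\log(\tau\rho)}{\log r} \longrightarrow \rho
\quad (r \to \infty).
```

Thus the bound (14) is attained with equality for this family.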
8.2. A probabilistic Riemann hypothesis. We conclude our discussion of complex
zeros with a probabilistic analogue of the Riemann hypothesis.

Theorem 8.3. Consider the random Dirichlet series

$$f(z) = a_1 + \frac{a_2}{2^z} + \frac{a_3}{3^z} + \cdots, \tag{17}$$

where the $a_k$ are independent complex standard normal random variables. This converges
with probability one if $\mathrm{Re}(z) > 1/2$. Then the expected number of zeros in
the rectangle $1/2 < x_1 < \mathrm{Re}(z) < x_2$, $y_1 < \mathrm{Im}(z) < y_2$, is

$$\frac{1}{2\pi} \left( \frac{\zeta'(2x_2)}{\zeta(2x_2)} - \frac{\zeta'(2x_1)}{\zeta(2x_1)} \right) (y_2 - y_1).$$

In particular, the density of zeros becomes infinite as we approach the critical line
$\{z \mid \mathrm{Re}(z) = 1/2\}$ from the right.

Proof. Following Section 3.2.5, we see that $v(z)^T C\,\overline{v(z)} = \zeta(z + \bar z)$, so the density
of zeros is

$$\frac{1}{4\pi}\,\frac{d^2}{dx^2} \log \zeta(2x),$$

where $x = \mathrm{Re}(z)$. □

Since (17) converges with probability one for $\mathrm{Re}(z) > 1/2$, one might try using
random Dirichlet series to study the Riemann zeta function inside the critical strip.
Unfortunately, as Section 3.2.5 and Theorem 8.3 suggest, random Dirichlet series
are more closely related to $\zeta(z + \bar z)$ than to $\zeta(z)$, and so the penetration of the
critical strip is illusory.
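The count in Theorem 8.3 can be explored numerically. The Python sketch below assumes the rectangle formula $\frac{1}{2\pi}\bigl(\zeta'(2x_2)/\zeta(2x_2) - \zeta'(2x_1)/\zeta(2x_1)\bigr)(y_2 - y_1)$. It approximates $\zeta(s)$ for real $s > 1$ by a partial sum with an Euler-Maclaurin tail correction, and $\zeta'/\zeta$ by a central difference of $\log \zeta$. Both approximations, the truncation point `N`, the step `h`, and all function names are illustrative choices rather than anything prescribed by the text.

```python
import math

def zeta(s, N=1000):
    """Riemann zeta for real s > 1: partial sum plus Euler-Maclaurin tail."""
    head = sum(n ** -s for n in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12
    return head + tail

def log_zeta_derivative(s, h=1e-5):
    """zeta'(s)/zeta(s), approximated by a central difference of log zeta."""
    return (math.log(zeta(s + h)) - math.log(zeta(s - h))) / (2 * h)

def expected_zeros(x1, x2, y1, y2):
    """Expected zeros in 1/2 < x1 < Re z < x2, y1 < Im z < y2."""
    jump = log_zeta_derivative(2 * x2) - log_zeta_derivative(2 * x1)
    return jump * (y2 - y1) / (2 * math.pi)

# Move the left edge of a unit-height rectangle toward the critical line.
for x1 in (0.75, 0.6, 0.51):
    print(x1, expected_zeros(x1, 1.0, 0.0, 1.0))
```

Because $\zeta'(s)/\zeta(s)$ behaves like $-1/(s-1)$ near $s = 1$, the count grows roughly like $1/(4\pi(x_1 - 1/2))$ per unit height as $x_1$ decreases to $1/2$, which is the blow-up of the density described in the theorem.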

9. The Buffon needle problem revisited 

In 1777, Buffon showed that if you drop a needle of length L on a plane containing 
parallel lines spaced a distance D from each other, then the expected number of 
intersections of the needle with the lines is 

$$\frac{2L}{\pi D}.$$

Buffon assumed L = D, but the restriction is not necessary. In fact the needle may 
be bent into any reasonable plane curve and the formula still holds. This is perhaps 
the most celebrated theorem in integral geometry and is considered by many to be 
the first [43]. 
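A small Monte Carlo experiment illustrates the formula $2L/(\pi D)$, including the case $L > D$ where a needle may cross several lines. The Python sketch below is a minimal simulation under the usual assumptions (needle centre uniform between two lines, direction uniform). The function name, trial count, and fixed seed are illustrative choices.

```python
import math
import random

def buffon_mean_crossings(L, D, trials=200_000, seed=0):
    """Monte Carlo estimate of the mean number of lines a needle crosses."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        y = rng.uniform(0.0, D)             # height of the needle's centre
        theta = rng.uniform(0.0, math.pi)   # angle between needle and lines
        half = 0.5 * L * math.sin(theta)    # vertical half-extent of the needle
        # Lines sit at integer multiples of D; count those between the endpoints.
        total += math.floor((y + half) / D) - math.floor((y - half) / D)
    return total / trials

# Compare the simulated mean with 2L / (pi D) for a short and a long needle.
for L in (1.0, 3.0):
    print(L, buffon_mean_crossings(L, 1.0), 2 * L / math.pi)
```

Counting crossings rather than recording only whether the needle hits a line is what makes the estimate linear in $L$, in line with the remark that the restriction $L = D$ is unnecessary.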

Let us translate the Buffon needle problem to the sphere as was first done by
Barbier in 1860; see [51] for a history. Consider a sphere with a fixed great circle.
Draw a "needle" (a small piece of a great circle) on the sphere at random, and 
consider the expected number of intersections of the needle with the great circle. If 
we instead fix the needle and vary the great circle, it is clear that the answer would 
be the same. 

Any rectifiable curve on the sphere can be approximated by a series of small 
needles. The expected number of intersections of the curve with a great circle is 
the sum of the expected number of intersections of each needle with a great circle. 
Thus the expected number of intersections of a fixed curve with a random great 
circle is a constant multiple of L, the length of the curve. To find the constant, 
consider the case where the fixed curve is itself a great circle. Then the average 
number of intersections is clearly 2 and $L$ is clearly $2\pi$. Thus the formula for the
expected number of intersections of the curve with a random great circle must be

$$\frac{L}{\pi}.$$

Of course the theorem generalizes to curves on a sphere of any dimension. 

To relate Barbier's result to random polynomials, we consider the curve $\gamma$ on
the unit sphere in $\mathbb{R}^{n+1}$. By Barbier, the length of $\gamma$ is $\pi$ times the expected number
of intersections of $\gamma$ with a random great circle. What are these intersections?
Consider a polynomial $p(x) = \sum_{k=0}^{n} a_k x^k$, and let $p_\perp$ be the equatorial $S^{n-1}$
perpendicular to the vector $p = (a_0, \ldots, a_n)$. Clearly $\gamma(t) \in p_\perp$ for the values of $t$ where
$\gamma(t) \perp p$. As we saw in Section 2, these are the values of $t$ for which $p(t) = 0$. Thus
the number of intersections of $\gamma$ with $p_\perp$ is exactly the number of real zeros of $p$,
and the expected number of real zeros is therefore the length of $\gamma$ divided by $\pi$.
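The conclusion can be checked numerically by measuring the curve directly. The Python sketch below assumes that $\gamma$ is the moment curve $(1, t, \ldots, t^n)$ projected onto the unit sphere, as described in the abstract. It substitutes $t = \tan\theta$ and rescales by $\cos^n\theta$ so that the whole real line, including $t = \pm\infty$, is covered by $\theta \in [-\pi/2, \pi/2]$, then sums the great-circle arcs between consecutive sample points. The function name and sample count are illustrative.

```python
import math

def expected_real_zeros(n, samples=20000):
    """Length of the normalized moment curve in R^(n+1), divided by pi."""
    def gamma(theta):
        # (1, t, ..., t^n) with t = tan(theta), multiplied by cos(theta)^n;
        # a positive rescaling does not change the point on the sphere.
        c, s = math.cos(theta), math.sin(theta)
        v = [c ** (n - k) * s ** k for k in range(n + 1)]
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    length = 0.0
    prev = gamma(-math.pi / 2)
    for i in range(1, samples + 1):
        cur = gamma(-math.pi / 2 + math.pi * i / samples)
        chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(prev, cur)))
        length += 2.0 * math.asin(min(1.0, chord / 2.0))  # arc between samples
        prev = cur
    return length / math.pi

# Expected number of real zeros for the first few degrees.
for n in (1, 2, 5, 10):
    print(n, expected_real_zeros(n))
```

For $n = 1$ the curve is a half circle of length $\pi$, so the value is 1, in agreement with the fact that a random linear polynomial has exactly one real zero with probability one.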